1 Standard integrals, integration by parts

It is important to grasp some basic techniques for evaluating integrals such as the method of 
substitutions, integration by parts etc. You should refer to Richard Earl's lecture notes, posted on 
the course web page, about a few standard substitutions. 

Before we give some examples, let us introduce some notation. Recall that in A-level mathematics
we write the integral of a function $f$ as $\int f(x)\,dx$ (indefinite integral), $\int_a^b f(x)\,dx$
(definite integral), etc. Here $f(x)$ is called the integrand, and the expression $f(x)\,dx$ is called a
differential form (of first order). It will be beneficial to introduce the notion of differentials. If
$y = f(x)$ is a function of one variable $x$ on some interval, then $dy = df(x) = f'(x)\,dx$ is called the
differential of $f$. The fundamental theorem of calculus says that

\[ \int df(x) = \int f'(x)\,dx = f(x) + C \]

where $C$ is an arbitrary constant.

The chain rule for derivatives implies that the differential of a function is invariant under
substitutions. More precisely, suppose $y = f(x)$ is a function of $x$, and make the substitution
$x = g(t)$, so that $y = f(g(t))$ is a function of $t$. Then $dx = g'(t)\,dt$, so that

\[ dy = f'(x)\,dx = f'(g(t))\,g'(t)\,dt = \frac{d}{dt} f(g(t))\,dt. \qquad \text{[Chain rule]} \]

That is, $df(x) = df(g(t))$ if $x = g(t)$; in other words, when we work out the differential $df(x)$ it
does not matter whether we consider $x$ as a variable or as a function of another variable. This
principle also applies to differential forms of first order. The substitution method can then be
summarized as the following equality:

\[ \int f(x)\,dx \overset{x = \varphi(t)}{=} \int f(\varphi(t))\,d\varphi(t) = \int f(\varphi(t))\,\varphi'(t)\,dt. \]


There is a similar version for definite integrals. 



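Example 1.1 Evaluate

\[ I = \int_0^1 \frac{dx}{\sqrt{4 - 2x - x^2}}. \]

The integral is close to the integral $\int \frac{dx}{\sqrt{1 - x^2}}$, which equals $\sin^{-1} x$ up to a
constant, so we attempt to use this known integral. By completing the square we may write

\[ 4 - 2x - x^2 = 5 - (x+1)^2 = 5\left[ 1 - \left( \frac{x+1}{\sqrt{5}} \right)^2 \right]. \]

Making the substitution $t = \frac{x+1}{\sqrt{5}}$, $dx = \sqrt{5}\,dt$, where $t$ runs from $\frac{1}{\sqrt{5}}$ to
$\frac{2}{\sqrt{5}}$, we have

\[ I = \int_{1/\sqrt{5}}^{2/\sqrt{5}} \frac{\sqrt{5}\,dt}{\sqrt{5(1 - t^2)}}
     = \int_{1/\sqrt{5}}^{2/\sqrt{5}} \frac{dt}{\sqrt{1 - t^2}}
     = \sin^{-1}\frac{2}{\sqrt{5}} - \sin^{-1}\frac{1}{\sqrt{5}}. \]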
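
The closed form in Example 1.1 can be sanity-checked numerically. The following is a minimal sketch, assuming a composite Simpson rule is accurate enough on $[0, 1]$ (the integrand is smooth there); the helper names `integrand` and `simpson` are illustrative and not part of the notes.

```python
import math

def integrand(x):
    # Integrand of Example 1.1: 1 / sqrt(4 - 2x - x^2)
    return 1.0 / math.sqrt(4.0 - 2.0 * x - x * x)

def simpson(f, a, b, n=1000):
    # Composite Simpson's rule; n is rounded up to an even number of subintervals.
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

numeric = simpson(integrand, 0.0, 1.0)
closed_form = math.asin(2 / math.sqrt(5)) - math.asin(1 / math.sqrt(5))
print(numeric, closed_form)  # the two values should agree to many digits
```

Agreement between the two numbers supports both the completed square and the limits of the substituted integral.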

Now let us recall the technique of integration by parts, which is in many respects the soul of
analysis. Integration by parts is the integral form of the product rule for derivatives: since
$(fg)' = f'g + fg'$, we have

\[ f(x)g(x) = \int g(x)f'(x)\,dx + \int f(x)g'(x)\,dx, \]

and rearranging the terms we obtain

\[ \int f(x)g'(x)\,dx = f(x)g(x) - \int g(x)f'(x)\,dx. \]

Similarly we have

\[ \int_a^b f(x)g'(x)\,dx = f(x)g(x)\Big|_a^b - \int_a^b g(x)f'(x)\,dx, \]

or, in terms of differentials, we can rewrite the preceding formula as

\[ \int_a^b f(x)\,dg(x) = f(x)g(x)\Big|_a^b - \int_a^b g(x)\,df(x). \]

However, there is no general rule to tell us how to split an integrand into $f(x)g'(x)$.
Example 1.2 Consider $I = \int x e^x\,dx$. Then

\[ I = \int x\,de^x = x e^x - \int e^x\,dx = (x - 1)e^x + C. \]

Example 1.3 Now let us consider $I_n = \int x^n e^x\,dx$ where $n = 1, 2, 3, \dots$. Using integration by parts,

\[ I_n = \int x^n\,de^x = x^n e^x - \int e^x\,dx^n = x^n e^x - n I_{n-1}, \]

which gives a reduction formula. By repeated use of integration by parts, one can eventually
work out the result. For example

\[ I_2 = x^2 e^x - 2I_1 = \left( x^2 - 2(x - 1) \right) e^x + C \]

and

\[ I_3 = x^3 e^x - 3I_2 = \left[ x^3 - 3\left( x^2 - 2(x - 1) \right) \right] e^x + C \]

etc.

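Example 1.4 Consider $I_n = \int \cos^n x\,dx$ where $n$ is a non-negative integer. Split the integrand
$\cos^n x$ into $\cos^{n-1}x \cos x = \cos^{n-1}x\,(\sin x)'$, and perform integration by parts. Then

\[ \begin{aligned}
I_n &= \int \cos^{n-1}x\,(\sin x)'\,dx = \int \cos^{n-1}x\,d\sin x \\
    &= \cos^{n-1}x \sin x - \int \sin x\,d\cos^{n-1}x \\
    &= \cos^{n-1}x \sin x - (n-1)\int \sin x \cos^{n-2}x\,(-\sin x)\,dx \\
    &= \cos^{n-1}x \sin x + (n-1)\int \sin^2 x \cos^{n-2}x\,dx.
\end{aligned} \]

Applying the identity $\sin^2 x = 1 - \cos^2 x$ in the last integral, we obtain

\[ \begin{aligned}
I_n &= \cos^{n-1}x \sin x + (n-1)\int (1 - \cos^2 x)\cos^{n-2}x\,dx \\
    &= \cos^{n-1}x \sin x + (n-1)I_{n-2} - (n-1)I_n.
\end{aligned} \]

Collecting the $I_n$ terms we obtain

\[ nI_n = (n-1)I_{n-2} + \cos^{n-1}x \sin x \]

so that

\[ I_n = \frac{n-1}{n} I_{n-2} + \frac{1}{n}\cos^{n-1}x \sin x, \]

which reduces the calculation of $I_n$ to $I_0$ or $I_1$, both of which are easy to evaluate. For example,
for $n \geq 2$ the boundary term vanishes and

\[ \int_0^{\pi/2} \cos^n x\,dx = \frac{n-1}{n}\int_0^{\pi/2} \cos^{n-2}x\,dx + \frac{1}{n}\cos^{n-1}x \sin x\Big|_0^{\pi/2}
   = \frac{n-1}{n}\int_0^{\pi/2} \cos^{n-2}x\,dx = \cdots \]

\[ = \begin{cases}
\dfrac{n-1}{n}\,\dfrac{n-3}{n-2}\cdots\dfrac{2}{3}\displaystyle\int_0^{\pi/2}\cos x\,dx & \text{if } n \text{ is odd}, \\[2ex]
\dfrac{n-1}{n}\,\dfrac{n-3}{n-2}\cdots\dfrac{1}{2}\displaystyle\int_0^{\pi/2}dx & \text{if } n \text{ is even}.
\end{cases} \]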
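
The reduction formula for $\int_0^{\pi/2}\cos^n x\,dx$ translates directly into a short recursion. The sketch below is illustrative only: `wallis` applies the recursion $W_n = \frac{n-1}{n}W_{n-2}$ with $W_0 = \pi/2$ and $W_1 = 1$, and `midpoint` is a hypothetical helper that approximates the same integral by the midpoint rule as an independent check.

```python
import math

def wallis(n):
    # W_n = integral of cos^n x over [0, pi/2], using W_n = (n - 1)/n * W_{n-2}.
    if n == 0:
        return math.pi / 2   # integral of 1 over [0, pi/2]
    if n == 1:
        return 1.0           # integral of cos x over [0, pi/2]
    return (n - 1) / n * wallis(n - 2)

def midpoint(n, steps=20000):
    # Midpoint-rule approximation of the same integral.
    h = (math.pi / 2) / steps
    return h * sum(math.cos((k + 0.5) * h) ** n for k in range(steps))

for n in range(2, 7):
    print(n, wallis(n), midpoint(n))
```

For odd $n$ the recursion ends at $W_1$, and for even $n$ at $W_0$, matching the two cases of the formula.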



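Example 1.5 Consider $I = \int e^x \sin x\,dx$. We have

\[ \begin{aligned}
I &= -\int e^x\,d\cos x = -e^x\cos x + \int e^x \cos x\,dx \\
  &= -e^x\cos x + \int e^x\,d\sin x \\
  &= -e^x\cos x + e^x\sin x - \int e^x\sin x\,dx \\
  &= -e^x\cos x + e^x\sin x - I
\end{aligned} \]

so that

\[ 2I = -e^x\cos x + e^x\sin x + C. \]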

2 First order differential equations 

A (ordinary) differential equation is an equation involving an independent variable $x$, a function
$y(x)$ and its derivatives:

\[ F(x, y, y', \dots, y^{(n)}) = 0. \]

By solving for the highest order derivative $y^{(n)}$ in terms of the lower order derivatives $y^{(k)}$ for
$k < n$ and $x$, the above equation may be written as

\[ y^{(n)} = f(x, y, y', \dots, y^{(n-1)}). \qquad (2.1) \]

Such an equation is called an $n$-th order differential equation. If $n = 1$, then it is called a first order
differential equation. Thus a first order differential equation has the general form $y' = f(x, y)$, or
implicitly $F(x, y, y') = 0$.

A function $y = \varphi(x)$ defined on some interval $J$ is called a solution of (2.1) if

\[ \varphi^{(n)}(x) = f(x, \varphi(x), \varphi'(x), \dots, \varphi^{(n-1)}(x)) \qquad \forall x \in J. \]

A function $y = \varphi(x)$ which contains $n$ independent arbitrary constants $C_1, \dots, C_n$ is called the
general solution of (2.1) if 1) it is a solution for any choice of $C_1, \dots, C_n$, and 2) any solution
of (2.1) has this form.

The concept of general solutions is not very useful. We are often interested in the so-called initial
value problems or boundary value problems. Observe that in order to determine the constants
$C_1, \dots, C_n$ we need in general $n$ conditions, which appear as initial conditions. More precisely,
an initial condition for the $n$-th order differential equation (2.1) may be formulated as

\[ y(x_0) = y_0, \ \dots, \ y^{(n-1)}(x_0) = y_{n-1} \]

where $x_0 \in J$ and $y_0, \dots, y_{n-1}$ are given data.

A differential equation is called an (inhomogeneous) linear differential equation if it is linear in
$y, y', \dots$, so that a linear differential equation can be written in the following general form

\[ a_n(x)y^{(n)} + a_{n-1}(x)y^{(n-1)} + \cdots + a_0(x)y = h(x) \]

where $a_n, \dots, a_0$ and $h$ are functions of $x$. If $h = 0$, then the linear equation is homogeneous.
A first order linear differential equation can thus be put in the following general form

\[ y' + p(x)y = q(x). \]
2.1 Separable first order DE

Consider a first order differential equation $\frac{dy}{dx} = f(x, y)$. It is separable if $f(x, y) = a(x)h(y)$,
so that $\frac{dy}{dx} = a(x)h(y)$. Dividing the equation by $h(y)$ and multiplying it by $dx$ separates the
variables $x$ and $y$, and the equation becomes

\[ \frac{dy}{h(y)} = a(x)\,dx. \]

Integrating both sides of the equation we obtain

\[ \int \frac{dy}{h(y)} = \int a(x)\,dx + C, \]

which in general gives the solutions of a separable equation implicitly. If $y_0$ is a root of $h(y) = 0$, then
clearly the constant function $y = y_0$ is also a solution.

Example 2.1 Find the general solution to

\[ x(y^2 - 1) + y(x^2 - 1)\frac{dy}{dx} = 0. \]

The equation is separable and can be rearranged as

\[ \frac{x\,dx}{x^2 - 1} + \frac{y\,dy}{y^2 - 1} = 0. \]

After integration (and multiplying by 2) we obtain

\[ \ln|x^2 - 1| + \ln|y^2 - 1| = C \]

($C$ is a constant), which can be put in the form

\[ (x^2 - 1)(y^2 - 1) = C \]

with a new constant $C$. The constant functions $y = 1$ and $y = -1$ are solutions but are already
included in the above general form with $C = 0$.

Example 2.2 Find the solution to $(1 + e^x)yy' = e^x$ satisfying the initial condition $y(0) = 1$.
The equation is separable:

\[ y\,dy = \frac{e^x}{1 + e^x}\,dx. \]

After integration we obtain the general solution

\[ \frac{y^2}{2} = \ln(1 + e^x) + C. \]

To match the initial condition, we set $x = 0$ and $y = 1$ in the general solution to determine the
constant $C = \frac{1}{2} - \ln 2$, so that $\frac{y^2}{2} = \ln(1 + e^x) + \frac{1}{2} - \ln 2$. After simplification we have

\[ y^2 = \ln\left[ \frac{e}{4}(1 + e^x)^2 \right]. \]
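
As a hedged cross-check of Example 2.2, the initial value problem can be integrated numerically and compared with the implicit solution $\frac{y^2}{2} = \ln(1 + e^x) + \frac{1}{2} - \ln 2$. This is a sketch only: `rk4` is a textbook fourth-order Runge-Kutta stepper written for this illustration, and `exact` takes the positive square root, which is an assumption justified by $y(0) = 1 > 0$.

```python
import math

def rhs(x, y):
    # y' = e^x / ((1 + e^x) y), obtained from (1 + e^x) y y' = e^x
    return math.exp(x) / ((1.0 + math.exp(x)) * y)

def rk4(f, x0, y0, x1, steps=1000):
    # Classical fourth-order Runge-Kutta integration from x0 to x1.
    h = (x1 - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def exact(x):
    # Positive branch of y^2 = 2 ln(1 + e^x) + 1 - 2 ln 2
    return math.sqrt(2 * math.log(1 + math.exp(x)) + 1 - 2 * math.log(2))

print(rk4(rhs, 0.0, 1.0, 2.0), exact(2.0))
```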

Some differential equations of first order can be transformed by proper substitutions to separable 
equations. 
Example 2.3 Find the general solution to $y' = \sin(x + y + 1)$.

Let $u(x) = x + y(x) + 1$ so that $u' = 1 + y'$. The original equation can be formulated as a DE for $u$,
namely $u' = 1 + \sin u$, which is separable. Dividing the equation by $1 + \sin u$ we write it as

\[ \frac{du}{1 + \sin u} = dx. \]

Integrating the equation we obtain

\[ \int \frac{du}{1 + \sin u} = \int dx. \]

Let us evaluate the integral on the left hand side:

\[ \begin{aligned}
\int \frac{du}{1 + \sin u} &= \int \frac{(1 - \sin u)\,du}{(1 + \sin u)(1 - \sin u)}
 = \int \frac{(1 - \sin u)\,du}{1 - \sin^2 u} = \int \frac{(1 - \sin u)\,du}{\cos^2 u} \\
&= \int \frac{du}{\cos^2 u} - \int \frac{\sin u\,du}{\cos^2 u}
 = \int \frac{du}{\cos^2 u} + \int \frac{d\cos u}{\cos^2 u} = \tan u - \frac{1}{\cos u} + C.
\end{aligned} \]

Therefore

\[ \tan u - \frac{1}{\cos u} = x + C \]

or, in terms of $y$ and $x$, the solution is given by

\[ \tan(x + y + 1) - \frac{1}{\cos(x + y + 1)} = x + C \]

or

\[ \sin(x + y + 1) - 1 = (x + C)\cos(x + y + 1). \]

We also have the solutions $x + y + 1 = 2n\pi - \frac{\pi}{2}$, where $n$ is an integer, which correspond to the
roots of $1 + \sin u = 0$.

2.2 Homogeneous equations

Consider a first order differential equation $\frac{dy}{dx} = f(x, y)$. If the function $f(x, y)$ (of two variables)
is homogeneous, i.e. $f(x, y) = h\left(\frac{y}{x}\right)$ where $h$ is a function of one variable, then we can make the
substitution $u(x) = \frac{y}{x}$, so that $y = xu$. The product rule gives $\frac{dy}{dx} = u + x\frac{du}{dx}$, and the
equation may be written as

\[ u + x\frac{du}{dx} = h(u), \]

which is separable.



Example 2.4 Find general solutions to $xy' = \sqrt{x^2 - y^2} + y$. Dividing both sides by $x$ (taking
$x > 0$), the equation is homogeneous:

\[ y' = \sqrt{1 - \left( \frac{y}{x} \right)^2} + \frac{y}{x}, \]

so we make the substitution $u = \frac{y}{x}$ and the equation becomes

\[ u + x\frac{du}{dx} = \sqrt{1 - u^2} + u. \]

Rearrange the equation: $\frac{du}{\sqrt{1 - u^2}} = \frac{dx}{x}$. Integrating both sides we obtain

\[ \sin^{-1} u = \ln|x| + C \]

or, in terms of $y$, general solutions are given by $\sin^{-1}\left(\frac{y}{x}\right) = \ln|x| + C$, together with the
solutions $\frac{y}{x} = 1$ and $\frac{y}{x} = -1$.

Some differential equations of first order can be transformed into homogeneous ones by simple
substitutions.

For example, consider the following type of first order differential equation:

\[ \frac{dy}{dx} = f\left( \frac{a_1 x + b_1 y + c_1}{a_2 x + b_2 y + c_2} \right). \]

If $c_1 = c_2 = 0$ then the equation is homogeneous, so we consider the case that $c_1$ or $c_2$ does not
vanish. If

\[ \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = 0 \]

and $b_1 \neq 0$, then we make the substitution $u(x) = a_1 x + b_1 y(x)$ to transform the equation into a
separable one. For the case where

\[ \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} \neq 0 \]

we make the translation $x = t + k$ and $y = z + l$ such that

\[ a_1 k + b_1 l + c_1 = 0, \qquad a_2 k + b_2 l + c_2 = 0. \]

Consider $t$ as a new independent variable, and $z$ as a function of $t$; then

\[ z(t) = y(x) - l = y(t + k) - l, \]

therefore, by the chain rule,

\[ \frac{dz}{dt} = \frac{dy}{dx}. \]

The differential equation we are interested in becomes

\[ \frac{dz}{dt} = f\left( \frac{a_1 t + b_1 z}{a_2 t + b_2 z} \right), \]

which is homogeneous.

Example 2.5 Find the general solution to

\[ y' = 2\left( \frac{y + 2}{x + y - 1} \right)^2. \]

Solve the linear system

\[ l + 2 = 0, \qquad k + l - 1 = 0 \]

to obtain $l = -2$ and $k = 3$. Let $x = t + 3$ and $y = z - 2$. Then the differential equation can be
written as

\[ z' = 2\left( \frac{z}{t + z} \right)^2, \]

which is homogeneous. Now making the standard substitution $u(t) = \frac{z}{t}$, so that $z' = u + tu'$, we obtain

\[ u + t\frac{du}{dt} = \frac{2u^2}{(1 + u)^2}, \]

which is separable. Rearrange the equation,

\[ t\frac{du}{dt} = \frac{2u^2 - u(1 + u)^2}{(1 + u)^2} = -\frac{u(1 + u^2)}{(1 + u)^2}, \]

and separate the variables to obtain

\[ \frac{(1 + u)^2}{u(1 + u^2)}\,du = -\frac{dt}{t}. \qquad (2.2) \]

Since

\[ \int \frac{(1 + u)^2}{u(1 + u^2)}\,du = \int \left( \frac{1}{u} + \frac{2}{1 + u^2} \right) du = \ln|u| + 2\tan^{-1}u, \]

by integrating equation (2.2) we obtain

\[ \ln|u| + 2\tan^{-1}u = -\ln|t| + C. \]

In terms of $x$ and $y$ (recall $t = x - 3$ and $z = y + 2$, so $u = \frac{y+2}{x-3}$) the general solution is given by

\[ \ln|y + 2| + 2\tan^{-1}\frac{y + 2}{x - 3} = C. \]

2.3 Linear differential equations of first order 

Consider a linear differential equation of first order

\[ \frac{dy}{dx} + p(x)y = q(x) \qquad (2.3) \]

where $p$ and $q$ are two continuous functions. The corresponding homogeneous equation
$\frac{dz}{dx} + p(x)z = 0$ is separable, and has the general solution

\[ z(x) = Ce^{-\int p(x)\,dx} \]

where $\int p(x)\,dx$ is a primitive of $p$ and $C$ is an arbitrary constant. It follows that $z(x)e^{\int p(x)\,dx}$
is a constant, so that

\[ \frac{d}{dx}\left( z(x)e^{\int p(x)\,dx} \right) = 0, \]

which is in turn equivalent to the homogeneous equation $z' + p(x)z = 0$.

Next we consider the inhomogeneous equation (2.3). The previous discussion suggests that we
consider the differential of $y(x)e^{\int p(x)\,dx}$, and by employing the product rule for derivatives, we obtain

\[ \frac{d}{dx}\left( y(x)e^{\int p(x)\,dx} \right) = e^{\int p(x)\,dx}\left( \frac{dy}{dx} + p(x)y \right) = q(x)e^{\int p(x)\,dx}, \qquad (2.4) \]

so by integrating both sides of the equation we obtain

\[ ye^{\int p(x)\,dx} = \int q(x)e^{\int p(x)\,dx}\,dx + C. \]

Dividing this equality by $e^{\int p(x)\,dx}$ we obtain the general solution of (2.3):

\[ y = e^{-\int p(x)\,dx}\left( \int q(x)e^{\int p(x)\,dx}\,dx + C \right). \qquad (2.5) \]

The function $e^{\int p(x)\,dx}$, by which $y$ is multiplied to form $ye^{\int p(x)\,dx}$, is called an integrating factor
for the inhomogeneous equation (2.3).

We may describe the above procedure for obtaining general solutions of first order linear differential
equations as follows. It includes an idea that can be applied in other situations, and is therefore
worth learning.

Observe that $z(x) = e^{-\int p(x)\,dx}$ is a non-trivial solution to the corresponding homogeneous
equation $z' + p(x)z = 0$. In order to obtain the general solution to the inhomogeneous one (2.3), we
make use of the solution $z(x)$ by making the substitution

\[ u = \frac{y}{z} \qquad (2.6) \]

(which is a standard substitution whenever $z$ is a known function which has something to do with
the differential equation we are interested in; we will use this idea on several occasions later on), and
turn (2.3) into a differential equation in $u$. Of course, according to the explicit form of $z(x)$ we
have $u(x) = y(x)e^{\int p(x)\,dx}$, and (2.4) just says that

\[ u' = q(x)e^{\int p(x)\,dx}, \]

which can be integrated to obtain the solution $u$.

Example 2.6 Solve the differential equation $y' + 2xy = 2xe^{-x^2}$.

First work out an integrating factor $r(x) = e^{\int 2x\,dx} = e^{x^2}$. Multiplying both sides of the
equation by $r(x)$ we obtain

\[ e^{x^2}y' + 2xe^{x^2}y = 2x, \]

that is

\[ \frac{d}{dx}\left( e^{x^2}y \right) = 2x. \]

After integration we have

\[ e^{x^2}y = x^2 + C \]

so that $y = (x^2 + C)e^{-x^2}$ is the general solution.
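
A quick numerical check of Example 2.6 is to substitute $y = (x^2 + C)e^{-x^2}$ back into $y' + 2xy = 2xe^{-x^2}$. The sketch below estimates $y'$ with a central difference, so the residual is only expected to be small, not exactly zero; the function names are illustrative.

```python
import math

def y(x, C):
    # Candidate general solution of y' + 2xy = 2x e^{-x^2}
    return (x * x + C) * math.exp(-x * x)

def residual(x, C, h=1e-5):
    # Central-difference estimate of y', plus 2xy, minus the right-hand side.
    dy = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dy + 2 * x * y(x, C) - 2 * x * math.exp(-x * x)

for C in (-1.0, 0.0, 2.5):
    print(C, [residual(x, C) for x in (-1.0, 0.3, 1.7)])
```

The residual stays near zero for every value of the constant tried, as expected of a general solution.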

Example 2.7 Bernoulli's equation is a non-linear first order equation

\[ \frac{dy}{dx} + p(x)y = q(x)y^n \]

where $n \neq 0$ or $1$ (but not necessarily an integer).

Dividing by $y^n$, the equation becomes

\[ \frac{1}{y^n}\frac{dy}{dx} + p(x)y^{1-n} = q(x). \]

By using the transformation $z = y^{1-n}$ the equation is transformed into a linear equation

\[ \frac{dz}{dx} + (1 - n)p(x)z = (1 - n)q(x) \]

so that, by (2.5),

\[ y^{1-n} = e^{-(1-n)\int p(x)\,dx}\left( (1 - n)\int q(x)e^{(1-n)\int p(x)\,dx}\,dx + C \right). \]



3 Linear differential equations 

Differential equations of second order play a special role in science. Many physical laws are
expressed as second order ordinary or partial differential equations, for example the dynamics described
by Newton's law of gravity, and fluid dynamics, which is governed by the Navier-Stokes equations.

3.1 Structure of general solutions to linear differential equations 

Let us first describe the structure of solutions to linear differential equations. Recall that the general
linear differential equation of order $n$ is an equation that can be written as

\[ a_n(x)y^{(n)} + \cdots + a_1(x)y' + a_0(x)y = f(x) \qquad (3.1) \]

where $a_0, \dots, a_n$ and $f$ are continuous functions (on some interval) and $a_n \neq 0$.

Suppose $y_p$ is a particular solution of (3.1); then clearly $y$ is a solution to (3.1) if and only if
$y - y_p$ is a solution to the corresponding homogeneous linear DE of $n$-th order

\[ a_n(x)y^{(n)} + \cdots + a_1(x)y' + a_0(x)y = 0. \qquad (3.2) \]

If $y_1$ and $y_2$ are two solutions to (3.2), then so is $\lambda y_1 + \mu y_2$, and moreover, there are $n$ linearly
independent solutions $y_1, \dots, y_n$ of (3.2) such that the general solution is

\[ y = C_1 y_1 + \cdots + C_n y_n \]

where $C_1, \dots, C_n$ are arbitrary constants. That is, the collection of all solutions to a homogeneous
linear equation of $n$-th order is a vector space of dimension $n$. It follows that the general solution
to (3.1) is given by

\[ y = C_1 y_1 + \cdots + C_n y_n + y_p \]

where $y_p$ is a particular solution of (3.1) and $C_1 y_1 + \cdots + C_n y_n$ is the general solution to the
corresponding homogeneous equation (3.2).

Let us investigate again the general observation we used to solve the general linear differential
equation of first order. That is, if there is a non-trivial function $z(x)$ which has some connection to
the differential equation we are interested in (for example, for a linear equation, the function may be
a solution to the corresponding homogeneous equation), we can make use of the known function in
a canonical way by making the substitution $u(x) = \frac{y(x)}{z(x)}$ and working with the differential equation
that $u$ must satisfy.

Obviously the constant zero function is a trivial solution to any homogeneous linear equation,
which of course gives us no additional information. Suppose however we know, say by inspection,
a non-trivial solution $z(x)$ to the homogeneous equation (3.2); then we may reduce the equation
to a lower order differential equation. Let us demonstrate this idea for homogeneous second order
differential equations, for simplicity.

Suppose $z(x) \neq 0$ is a non-trivial solution to a homogeneous linear differential equation of second
order

\[ p(x)\frac{d^2y}{dx^2} + q(x)\frac{dy}{dx} + r(x)y = 0. \qquad (3.3) \]

Make the standard substitution $u(x) = \frac{y}{z}$, so that $y = uz$. Then $y' = u'z + uz'$ and
$y'' = u''z + 2u'z' + uz''$; substituting these equations into (3.3) we obtain

\[ p(x)\left( u''z + 2u'z' + uz'' \right) + q(x)\left( u'z + uz' \right) + r(x)uz = 0. \]

Rearrange the above equation and use the fact that $z$ is a solution to (3.3):

\[ p(x)z(x)\frac{d^2u}{dx^2} + \left( 2p(x)z'(x) + q(x)z(x) \right)\frac{du}{dx} = 0, \qquad (3.4) \]

which is a homogeneous differential equation of first order for the unknown function $u'$.
Example 3.1 Verify that $z(x) = \frac{1}{x}$ is a solution to

\[ xy'' + 2(1 - x)y' - 2y = 0, \]

hence find its general solution.

Since $z' = -x^{-2}$ and $z'' = 2x^{-3}$ we can easily see that $z$ is a solution. Making the substitution
$y(x) = \frac{1}{x}u(x)$ in the equation we obtain a differential equation for $u$:

\[ x \cdot \frac{1}{x}\frac{d^2u}{dx^2} + \left( -2x^{-2}x + 2(1 - x)\frac{1}{x} \right)\frac{du}{dx} = 0. \]

Let $w = \frac{du}{dx}$ and simplify the above equation:

\[ \frac{dw}{dx} - 2w = 0, \]

which is separable, and has the general solution $w(x) = C_1 e^{2x}$. Integrating $w$ (and renaming the
constant $C_1$) we obtain

\[ u(x) = \int w(x)\,dx = C_1 e^{2x} + C_2 \]

so that

\[ y(x) = \frac{1}{x}\left( C_1 e^{2x} + C_2 \right) \qquad (3.5) \]

is the general solution, where $C_1$ and $C_2$ are arbitrary constants.

Example 3.2 Find the general solution to the inhomogeneous linear equation

\[ xy'' + 2(1 - x)y' - 2y = 12x. \]

We have found the general solution to the corresponding homogeneous equation, which is given
by (3.5); thus, according to the structure of solutions to linear equations, we only need to find a
particular solution. Since the coefficients of the equation are all polynomials in $x$, we may look
for a solution of the form $y(x) = ax + b$ where $a, b$ are constants. Plugging $y'' = 0$,
$y' = a$ and $y = ax + b$ into the equation gives

\[ 2a(1 - x) - 2(ax + b) = 12x, \]

so we should have $2a - 2b = 0$ and $-2a - 2a = 12$, so that $a = -3$ and $b = -3$. Thus
$y_p(x) = -3x - 3$ is a particular solution, and the general solution is given by

\[ y(x) = \frac{1}{x}\left( C_1 e^{2x} + C_2 \right) - 3x - 3. \]

3.2 Linear ODE with constant coefficients

For a homogeneous linear ODE with constant coefficients,

\[ y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_1 y' + a_0 y = 0 \qquad (3.6) \]

where $a_{n-1}, \dots, a_0$ are constants, we can construct the general solution if we can find the roots of
the auxiliary equation

\[ m^n + a_{n-1}m^{n-1} + \cdots + a_1 m + a_0 = 0. \qquad (3.7) \]

The auxiliary equation comes from the following observation. Since the derivative of $e^{mx}$ is $me^{mx}$,
it is reasonable to search for a solution $y = e^{mx}$. Substituting $y^{(k)} = m^k e^{mx}$ into (3.6) we have

\[ \left( m^n + a_{n-1}m^{n-1} + \cdots + a_1 m + a_0 \right)e^{mx} = 0, \]

thus $e^{mx}$ is a solution if and only if $m$ is a root of (3.7), and it is a real solution as long as $m$ is real.
If $m = \alpha + \beta i$ is a complex root of the auxiliary equation, then, since the coefficients
$a_{n-1}, \dots, a_0$ are real numbers, $\bar{m} = \alpha - \beta i$ is also a root. Now the complex functions $e^{mx}$
and $e^{\bar{m}x}$ both satisfy the differential equation (3.6), so that the real and imaginary parts of

\[ e^{mx} = e^{\alpha x}\cos(\beta x) + ie^{\alpha x}\sin(\beta x) \]

(Euler's formula) are solutions of (3.6), i.e. if $m = \alpha + \beta i$ is a complex root of the auxiliary
equation, then

\[ y_1(x) = e^{\alpha x}\cos(\beta x) \]

and

\[ y_2(x) = e^{\alpha x}\sin(\beta x) \]

are a pair of linearly independent solutions of (3.6).

If $m$ is a repeated root of the auxiliary equation with multiplicity $k \geq 2$, then
$e^{mx}, xe^{mx}, \dots, x^{k-1}e^{mx}$ are solutions. A similar conclusion is valid for complex roots. We are
therefore able to construct $n$ linearly independent solutions to (3.6) via the roots of the auxiliary equation.



Example 3.3 Consider the harmonic motion described by

\[ \frac{d^2y}{dx^2} + \omega^2 y = 0 \]

where $\omega \neq 0$ is real. The auxiliary equation is $m^2 + \omega^2 = 0$, which has two complex roots $m = \omega i$
and $\bar{m}$. So we have two independent solutions $\cos\omega x$ and $\sin\omega x$, and the general solution is

\[ y(x) = A\cos\omega x + B\sin\omega x \]

where $A, B$ are arbitrary constants.

Example 3.4 Solve the equation

\[ \frac{d^3y}{dx^3} - 4\frac{d^2y}{dx^2} + \frac{dy}{dx} + 6y = 0. \]

The auxiliary equation

\[ m^3 - 4m^2 + m + 6 = 0 \]

has roots $-1, 2, 3$, so the general solution is

\[ y(x) = C_1 e^{-x} + C_2 e^{2x} + C_3 e^{3x}. \]

The situation for second order DEs with constant coefficients is particularly simple. Consider the
homogeneous linear equation

\[ \frac{d^2y}{dx^2} + a\frac{dy}{dx} + by = 0 \qquad (3.8) \]

where $a, b$ are two real numbers.

Theorem 3.5 Suppose the auxiliary equation

\[ m^2 + am + b = 0 \]

has two roots $m_1$ and $m_2$.

1) If $m_1 \neq m_2$ are real, then the general solution is given by

\[ y(x) = C_1 e^{m_1 x} + C_2 e^{m_2 x}. \]

2) If $m = m_1 = m_2$ is a repeated real root, then the general solution is

\[ y(x) = (C_1 + C_2 x)e^{mx}. \]

3) If $m_1 = \alpha + i\beta$ is a complex root ($\beta \neq 0$) so that $m_2 = \alpha - i\beta$, then the general solution is

\[ y(x) = e^{\alpha x}\left( C_1\cos\beta x + C_2\sin\beta x \right). \]

Proof. Note that $a = -(m_1 + m_2)$ and $b = m_1 m_2$. We consider 1) and 2) first. In this case
$e^{m_1 x}$ is a solution, so we make the substitution $y(x) = u(x)e^{m_1 x}$ in the differential equation. Since

\[ y' = (u' + m_1 u)e^{m_1 x} \]

and

\[ y'' = (u'' + 2m_1 u' + m_1^2 u)e^{m_1 x} \]

we obtain

\[ (u'' + 2m_1 u' + m_1^2 u) + a(u' + m_1 u) + bu = 0. \]

Using the fact that $m_1$ is a root and that $a = -(m_1 + m_2)$, we have

\[ u'' - (m_2 - m_1)u' = 0. \]

Thus, if $m_2 - m_1 \neq 0$,

\[ u'(x) = C_1 e^{(m_2 - m_1)x} \]

and integrating the equation (and renaming the constant $C_1$) we obtain

\[ u(x) = C_1 e^{(m_2 - m_1)x} + C_2, \]

which proves 1). If $m_2 - m_1 = 0$ then $u'' = 0$, so by integrating twice we obtain

\[ u(x) = C_1 + C_2 x, \]

which shows 2). ■
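
The three cases of Theorem 3.5 can be told apart from the discriminant of the auxiliary equation. The following is a small sketch of that classification; the function name and the tuple it returns are illustrative choices, and the exact test `disc == 0` is only reliable when the coefficients are exactly representable (an assumption made here for simplicity).

```python
import cmath

def general_solution(a, b):
    # Classify y'' + a y' + b y = 0 (a, b real) by the roots of m^2 + a m + b = 0.
    disc = a * a - 4 * b
    if disc > 0:
        r = disc ** 0.5
        # y = C1 e^{m1 x} + C2 e^{m2 x}
        return ("distinct real", (-a - r) / 2, (-a + r) / 2)
    if disc == 0:
        # y = (C1 + C2 x) e^{m x}
        return ("repeated real", -a / 2, -a / 2)
    m = (-a + cmath.sqrt(disc)) / 2
    # y = e^{alpha x} (C1 cos(beta x) + C2 sin(beta x)), with m = alpha + i beta
    return ("complex", m.real, m.imag)

print(general_solution(-3, 2))   # roots of m^2 - 3m + 2
print(general_solution(4, 4))    # roots of m^2 + 4m + 4
print(general_solution(-2, 5))   # roots of m^2 - 2m + 5
```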

Example 3.6 Solve the differential equation

\[ \frac{d^2y}{dx^2} - 2\frac{dy}{dx} + 5y = 0. \]

The auxiliary equation $m^2 - 2m + 5 = 0$ has complex roots $m = 1 + 2i$ and $\bar{m}$, so the general solution is

\[ y(x) = C_1 e^x\cos 2x + C_2 e^x\sin 2x. \]

Next we give some examples for inhomogeneous linear equations.
Example 3.7 Solve the equation

\[ \frac{d^2y}{dx^2} + 4y = \sin 3x. \]

It is easy to find the general solution to the corresponding homogeneous equation

\[ \frac{d^2y}{dx^2} + 4y = 0, \]

whose auxiliary equation $m^2 + 4 = 0$ has two complex roots $\pm 2i$. Note that $\sin 3x$ is the imaginary
part of $e^{3ix}$, and $3i$ is not a root of the auxiliary equation. Thus we search for a particular solution
$y_p(x) = A\sin 3x$. Plugging into the equation we find $-9A + 4A = 1$, i.e. $A = -\frac{1}{5}$. Hence the
general solution is

\[ y(x) = C_1\cos 2x + C_2\sin 2x - \frac{1}{5}\sin 3x. \]

Example 3.8 Consider

\[ \frac{d^2y}{dx^2} + 4\frac{dy}{dx} + 4y = \sin 3x. \]

The auxiliary equation $m^2 + 4m + 4 = 0$ has a repeated root $-2$. There is no particular solution
of the form $A\sin 3x$, as a simple inspection shows; instead we look for a particular solution

\[ y_p(x) = A\cos 3x + B\sin 3x. \]

Comparing the coefficients of $\cos 3x$ and $\sin 3x$, we get

\[ -9A + 12B + 4A = 0 \]

and

\[ -9B - 12A + 4B = 1. \]

Thus

\[ A = -\frac{12}{169}, \qquad B = -\frac{5}{169}. \]

The general solution is

\[ y(x) = (C_1 x + C_2)e^{-2x} - \frac{12}{169}\cos 3x - \frac{5}{169}\sin 3x. \]
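
The two linear conditions on $A$ and $B$ in Example 3.8 simplify to $-5A + 12B = 0$ and $-12A - 5B = 1$, and can be solved exactly. The sketch below uses Cramer's rule with rational arithmetic; `solve_2x2` is a helper written for this illustration only.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    # Cramer's rule for  a11*A + a12*B = b1,  a21*A + a22*B = b2.
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("singular system")
    return (Fraction(b1 * a22 - a12 * b2) / det,
            Fraction(a11 * b2 - b1 * a21) / det)

# cos 3x terms: -5A + 12B = 0 ;  sin 3x terms: -12A - 5B = 1
A, B = solve_2x2(-5, 12, 0, -12, -5, 1)
print(A, B)
```

Working with `Fraction` avoids rounding, so the denominators 169 appear exactly.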



Example 3.9 Let us now consider

\[ \frac{d^2y}{dx^2} + 4y = \sin 2x. \]

We have seen that $\sin 2x$ is a solution to the corresponding homogeneous equation, so we look
for a particular solution

\[ y_p(x) = Ax\cos 2x + Bx\sin 2x. \]

Then $B = 0$ and $A = -\frac{1}{4}$, so the general solution is

\[ y(x) = C_1\cos 2x + C_2\sin 2x - \frac{1}{4}x\cos 2x. \]
Example 3.10 Find a particular solution to

\[ \frac{d^2y}{dx^2} + 4y = \sin x + \sin 2x. \]

By a simple inspection, $y_1 = \frac{1}{3}\sin x$ is a particular solution to

\[ \frac{d^2y}{dx^2} + 4y = \sin x \]

and we know from the previous example that $y_2 = -\frac{1}{4}x\cos 2x$ is a particular solution to

\[ \frac{d^2y}{dx^2} + 4y = \sin 2x. \]

Thus

\[ y_p = \frac{1}{3}\sin x - \frac{1}{4}x\cos 2x \]

is a particular solution.

Example 3.11 Let us consider the inhomogeneous linear equation

\[ \frac{d^2y}{dx^2} - 3\frac{dy}{dx} + 2y = f(x) \]

where $f(x)$ is a given function.

The auxiliary equation $m^2 - 3m + 2 = 0$ has two real roots 1 and 2, so the general solution to
the corresponding homogeneous equation is $C_1 e^x + C_2 e^{2x}$.

1) Suppose $f(x) = \sin x$, which is the imaginary part of $e^{ix}$. Since $i$ is not a root of the auxiliary
equation, we may search for a particular solution $y_p = A\sin x + B\cos x$ (the form $A\sin x$ alone
is not sufficient). Feeding $y_p$, $y_p' = A\cos x - B\sin x$ and $y_p'' = -y_p$ into the differential equation,

\[ -y_p - 3(A\cos x - B\sin x) + 2y_p = \sin x, \]

and collecting the terms of $\sin x$ and $\cos x$ together, we obtain

\[ (A + 3B - 1)\sin x + (B - 3A)\cos x = 0. \]

Set $A + 3B - 1 = 0$ and $B - 3A = 0$, and solve the system to obtain $A = \frac{1}{10}$ and $B = \frac{3}{10}$. The
general solution is given by

\[ y = C_1 e^x + C_2 e^{2x} + \frac{1}{10}\sin x + \frac{3}{10}\cos x. \]

2) $f(x) = e^{3x}$. Since 3 is not a root of the auxiliary equation, we search for a particular solution
$y_p = Ae^{3x}$. Feeding it into the differential equation,

\[ (9A - 9A + 2A)e^{3x} = e^{3x}, \]

we obtain a particular solution $y_p = \frac{1}{2}e^{3x}$.

If however $f(x) = e^{2x}$, then we may attempt a particular solution $y_p = Axe^{2x}$, as $e^{2x}$ is a solution
to the corresponding homogeneous equation. Using the equations $y_p' = Ae^{2x} + 2y_p$ and

\[ y_p'' = 2Ae^{2x} + 2y_p' = 4Ae^{2x} + 4y_p, \]

feeding them into the differential equation,

\[ 4Ae^{2x} + 4y_p - 3\left( Ae^{2x} + 2y_p \right) + 2y_p = e^{2x}, \]

and collecting the terms in $e^{2x}$ and $y_p$ together, we get

\[ (4A - 3A - 1)e^{2x} = 0 \]

so that $A = 1$, i.e. $y_p = xe^{2x}$ is a solution, so that the general solution to

\[ \frac{d^2y}{dx^2} - 3\frac{dy}{dx} + 2y = e^{2x} \]

is given by

\[ y = C_1 e^x + C_2 e^{2x} + xe^{2x}. \]

3) $f(x) = xe^{2x}$. Since 2 is a root of the auxiliary equation, we may search for a particular
solution of the form $y_p = (Ax^2 + Bx)e^{2x}$ (we have included $Bxe^{2x}$ as well, since $e^{2x}$ is a solution to
the homogeneous equation, but $xe^{2x}$ is not).

4) $f(x) = e^x\sin x$, which is the imaginary part of $e^{(1+i)x}$, and $1 + i$ is not a root of the auxiliary
equation, so we may search for a particular solution $y_p = (A\cos x + B\sin x)e^x$.

5) $f(x) = \sin^2 x$. Since $\sin^2 x = \frac{1}{2} - \frac{1}{2}\cos 2x$, we may attempt a particular solution of the
form $y_p = A + B\cos 2x + C\sin 2x$.
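
Trial solutions of this kind are easy to check by substituting them back into $y'' - 3y' + 2y = f(x)$. The sketch below does so with finite differences for three of the cases above. For $f(x) = \sin x$, matching coefficients in $y_p = A\sin x + B\cos x$ gives $A + 3B = 1$ and $B - 3A = 0$, i.e. $A = \frac{1}{10}$, $B = \frac{3}{10}$, which is the candidate tested here. The helper `residual` and the step size are illustrative assumptions, and the residuals are only expected to be small, not exactly zero.

```python
import math

def residual(yp, f, x, h=1e-4):
    # Finite-difference estimate of yp'' - 3 yp' + 2 yp - f at the point x.
    d1 = (yp(x + h) - yp(x - h)) / (2 * h)
    d2 = (yp(x + h) - 2 * yp(x) + yp(x - h)) / (h * h)
    return d2 - 3 * d1 + 2 * yp(x) - f(x)

cases = [
    # (particular solution, right-hand side f)
    (lambda x: 0.1 * math.sin(x) + 0.3 * math.cos(x), math.sin),
    (lambda x: 0.5 * math.exp(3 * x), lambda x: math.exp(3 * x)),
    (lambda x: x * math.exp(2 * x), lambda x: math.exp(2 * x)),
]

for yp, f in cases:
    print([residual(yp, f, x) for x in (0.0, 0.5)])
```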



4 Some facts about matrices 

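An $m \times n$ matrix $A$ is an array of numbers arranged into $m$ rows and $n$ columns

\[ A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix} \]

and simply written as $A = (a_{ij})$, where $a_{ij}$ is the entry in the $i$th row and $j$th column. If $m = n$,
then $A$ is called a square matrix.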

Let us concentrate on 2 x 2 matrices. You will learn the general theory about matrices in linear 
algebra (topics in your paper Mathematics I). 

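First of all, we have elementary operations among $2 \times 2$ matrices: if

\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \qquad
   B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}, \]

then we can form a matrix $A + B$ by simply adding their corresponding entries:

\[ A + B = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{pmatrix}. \]

If $\lambda$ is a number we may form a matrix

\[ \lambda A = \begin{pmatrix} \lambda a_{11} & \lambda a_{12} \\ \lambda a_{21} & \lambda a_{22} \end{pmatrix}. \]

That is, $A \pm B = (a_{ij} \pm b_{ij})$ and $\lambda A = (\lambda a_{ij})$. The more interesting operation is the
multiplication of two matrices, defined as follows:

\[ AB = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
        \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
      = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\
                        a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}. \]

That is, if $AB = (c_{ij})$ then the entry

\[ c_{ij} = (a_{i1}, a_{i2})\begin{pmatrix} b_{1j} \\ b_{2j} \end{pmatrix}
          = \text{dot product of } (a_{i1}, a_{i2}) \text{ and } (b_{1j}, b_{2j}). \]
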

Example 4.1 Let 



A={ M andB=( ^ 



Find A + B,A-B, -A, AB and BA. 

In general we have $A + B = B + A$, $C(A + B) = CA + CB$ and $(AB)C = A(BC)$, but the
multiplication of matrices is in general not commutative.

The determinant of a $2 \times 2$ matrix $A$ is denoted by $\det(A)$ or $|A|$ and defined by

\[ \det(A) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}. \]

The mapping $A \to \det(A)$ is not additive, but it is multiplicative, i.e.

\[ \det(AB) = \det(A)\det(B) = \det(BA). \]

We will use $I$ to denote the identity matrix

\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]

It is trivial that $IA = AI = A$ for any $2 \times 2$ matrix $A$. Clearly $|I| = 1$.

Given a $2 \times 2$ matrix $A$, we say a $2 \times 2$ matrix $B$ (if it exists) is an inverse matrix of $A$ if
$AB = BA = I$. Since $\det(AB) = \det(A)\det(B)$, a necessary condition for the existence of
an inverse matrix of $A$ is that $\det(A) \neq 0$. It turns out that this condition is also sufficient.
Theorem 4.2 Let $A = (a_{ij})$ be a $2 \times 2$ matrix. Then $A$ has an inverse matrix if and only if
$\det(A) \neq 0$. In this case, the inverse matrix is unique and is thus denoted by $A^{-1}$, given by

\[ A^{-1} = \frac{1}{\det(A)}\begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}. \]

Proof. By a direct computation we can see that $A^{-1}$ defined as above is an inverse matrix. If
$B$ is an inverse of $A$, then

\[ B = B(AA^{-1}) = (BA)A^{-1} = IA^{-1} = A^{-1} \]

so the inverse matrix is unique. ■
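
The inverse formula of Theorem 4.2 is short enough to implement directly. Below is a minimal sketch using exact rational arithmetic; matrices are represented as nested tuples, and the names `inverse_2x2` and `matmul_2x2` are illustrative, not taken from the notes.

```python
from fractions import Fraction

def inverse_2x2(m):
    # m = ((a11, a12), (a21, a22)); inverse = (1/det) * ((a22, -a12), (-a21, a11)).
    (a11, a12), (a21, a22) = m
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("matrix is not invertible (determinant is zero)")
    return ((a22 / det, -a12 / det), (-a21 / det, a11 / det))

def matmul_2x2(p, q):
    # Entry (i, j) is the dot product of row i of p with column j of q.
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

A = ((1, 2), (3, 4))
A_inv = inverse_2x2(A)
print(A_inv, matmul_2x2(A, A_inv), matmul_2x2(A_inv, A))
```

Both products should be the identity matrix, which is the defining property $AB = BA = I$.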

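We observe that $\det(A) = 0$, i.e. $a_{11}a_{22} = a_{21}a_{12}$, means that the two row vectors $(a_{11}, a_{12})$
and $(a_{21}, a_{22})$ are proportional, that is, they are linearly dependent.

Let us consider $\mathbb{R}^2$ as the vector space of column vectors $v = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$ (also considered as
$2 \times 1$ matrices). Let $A = (a_{ij})$ be a $2 \times 2$ matrix. Then we associate with $A$ a linear mapping from
$\mathbb{R}^2 \to \mathbb{R}^2$, also denoted by $A$ and defined by

\[ Av = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}
      = \begin{pmatrix} a_{11}v_1 + a_{12}v_2 \\ a_{21}v_1 + a_{22}v_2 \end{pmatrix}
      = v_1\begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix} + v_2\begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix}. \]

Proposition 4.3 Let $A = (a_{ij})$ be a $2 \times 2$ matrix.

1) The linear system $Av = 0$ has non-zero solutions if and only if $\det(A) = 0$.

2) The linear system $Av = \lambda v$ has a non-trivial solution $v \neq 0$ if and only if $\lambda$ is an eigenvalue
of $A$, that is, $\det(A - \lambda I) = 0$ (which is called the characteristic equation of $A$). In this case, $v$ is
called an eigenvector (corresponding to the eigenvalue $\lambda$).

A square matrix $A = (a_{ij})$ is diagonal if $a_{ij} = 0$ for any $i \neq j$.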

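Theorem 4.4 Suppose a $2 \times 2$ matrix $A = (a_{ij})$ has distinct real eigenvalues $\lambda_1$ and $\lambda_2$, and let
$v_i = \begin{pmatrix} v_{1i} \\ v_{2i} \end{pmatrix}$ be corresponding eigenvectors ($i = 1, 2$). Let

\[ P = (v_1, v_2) = \begin{pmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{pmatrix}. \]

Then

\[ P^{-1}AP = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}. \]

Proof. First we show that $P$ is invertible, which is equivalent to $v_1$ and $v_2$ being linearly
independent. Suppose $\alpha v_1 + \beta v_2 = 0$, so that $\alpha Av_1 + \beta Av_2 = 0$. Hence $\alpha\lambda_1 v_1 + \beta\lambda_2 v_2 = 0$.
Subtracting $\lambda_1$ times the first relation, it follows that $\beta(\lambda_2 - \lambda_1)v_2 = 0$, so that $\beta = 0$ and
similarly $\alpha = 0$. Therefore $v_1$ and $v_2$ are linearly independent, and $P^{-1}$ exists.

By definition

\[ AP = A(v_1, v_2) = (Av_1, Av_2) = (\lambda_1 v_1, \lambda_2 v_2) = P\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}. \]

Since $P^{-1}$ exists,

\[ P^{-1}AP = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}. \]
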

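Example 4.5 Find all the eigenvalues and eigenvectors for the following matrices

\[ A = \begin{pmatrix} 2 & 1 \\ 6 & 3 \end{pmatrix}, \qquad
   B = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix} \qquad \text{and} \qquad
   C = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \]
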

5 Systems of linear differential equations 

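Consider a linear differential equation of order $n$:

\[ y^{(n)} + a_{n-1}y^{(n-1)} + \cdots + a_0 y = f(t). \]

By introducing the functions $y_k = y^{(k)}$ where $k = 0, \dots, n - 1$, the previous linear equation of order
$n$ is equivalent to the following system of linear equations of first order:

\[ \begin{aligned}
y_{n-1}' &= -a_{n-1}y_{n-1} - \cdots - a_0 y_0 + f(t), \\
y_{n-2}' &= y_{n-1}, \\
         &\ \ \vdots \\
y_0'     &= y_1.
\end{aligned} \]

For example, a second order linear differential equation

\[ \frac{d^2y}{dt^2} + a\frac{dy}{dt} + by = f(t) \]

is equivalent to the system

\[ \begin{aligned}
\frac{dx}{dt} &= -ax - by + f(t), \\
\frac{dy}{dt} &= x.
\end{aligned} \]

In terms of matrix notation, it can be written as

\[ \begin{pmatrix} \dfrac{dx}{dt} \\[1.5ex] \dfrac{dy}{dt} \end{pmatrix}
 = \begin{pmatrix} -a & -b \\ 1 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}
 + \begin{pmatrix} f(t) \\ 0 \end{pmatrix}. \]
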

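Example 5.1 Solve the following initial value problem

\[ \begin{aligned}
\frac{dx}{dt} &= 3x + y, & x(0) &= 1; \\
\frac{dy}{dt} &= 6x + 4y, & y(0) &= 1.
\end{aligned} \]

Method 1. From the first equation, substitute $y = \frac{dx}{dt} - 3x$ into the second equation, to obtain

\[ \frac{d^2x}{dt^2} - 3\frac{dx}{dt} = 6x + 4\frac{dx}{dt} - 12x \]

so $x$ solves the homogeneous linear equation of second order

\[ \frac{d^2x}{dt^2} - 7\frac{dx}{dt} + 6x = 0 \]

whose auxiliary equation has roots 1 and 6, so $x(t) = C_1 e^t + C_2 e^{6t}$. Since $x(0) = 1$ and
$x'(0) = y(0) + 3x(0) = 4$ we have

\[ C_1 + C_2 = 1, \qquad C_1 + 6C_2 = 4, \]

which gives $C_1 = \frac{2}{5}$ and $C_2 = \frac{3}{5}$. Hence $x(t) = \frac{2}{5}e^t + \frac{3}{5}e^{6t}$ and
$y(t) = x'(t) - 3x(t) = -\frac{4}{5}e^t + \frac{9}{5}e^{6t}$.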
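
Solving the two conditions gives $C_1 = \frac{2}{5}$, $C_2 = \frac{3}{5}$, hence $x(t) = \frac{2}{5}e^t + \frac{3}{5}e^{6t}$ and $y(t) = x' - 3x = -\frac{4}{5}e^t + \frac{9}{5}e^{6t}$. The sketch below substitutes these closed forms, with their derivatives computed by hand, back into the system and the initial conditions; the function names are illustrative.

```python
import math

def x(t):
    return 0.4 * math.exp(t) + 0.6 * math.exp(6 * t)

def y(t):
    return -0.8 * math.exp(t) + 1.8 * math.exp(6 * t)

def dx(t):
    # Derivative of x(t), computed by hand
    return 0.4 * math.exp(t) + 3.6 * math.exp(6 * t)

def dy(t):
    # Derivative of y(t), computed by hand
    return -0.8 * math.exp(t) + 10.8 * math.exp(6 * t)

def max_residual(ts):
    # Largest violation of x' = 3x + y and y' = 6x + 4y over the sample times ts.
    return max(
        max(abs(dx(t) - (3 * x(t) + y(t))), abs(dy(t) - (6 * x(t) + 4 * y(t))))
        for t in ts
    )

print(x(0.0), y(0.0), max_residual([0.0, 0.1, 0.5]))
```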
Method 2. By the chain rule (or the invariance of first order differentials) we have

\[ \frac{dy}{dx} = \frac{6x + 4y}{3x + y}, \]

which is homogeneous. With the substitution $u = \frac{y}{x}$ we have $y = xu$ and $y' = u + xu'$, so that

\[ u + x\frac{du}{dx} = \frac{6 + 4u}{3 + u}, \]

which is separable.

We next describe another method which is contained in the following 

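Theorem 5.2 Consider the system of linear equations with constant coefficients

\[ \frac{d}{dt}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \]

Suppose $A = (a_{ij})$ has two distinct eigenvalues $\lambda_1$, $\lambda_2$ with corresponding eigenvectors
$v_k = \begin{pmatrix} v_{1k} \\ v_{2k} \end{pmatrix}$ ($k = 1, 2$). Then the general solution of the system is given by

\[ \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = C_1 e^{\lambda_1 t} v_1 + C_2 e^{\lambda_2 t} v_2, \]

where $C_1$ and $C_2$ are arbitrary constants.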

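Proof. Let $P = (v_1, v_2)$. Then we have shown that

\[ P^{-1} A P = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}. \]

Let

\[ z(t) = \begin{pmatrix} z_1(t) \\ z_2(t) \end{pmatrix} = P^{-1} \begin{pmatrix} x(t) \\ y(t) \end{pmatrix}. \]

Then

\[ \frac{d}{dt} z(t) = P^{-1} \frac{d}{dt} \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = P^{-1} A \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = P^{-1} A P z(t) = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \begin{pmatrix} z_1(t) \\ z_2(t) \end{pmatrix}. \]

That is,

\[ z_1'(t) = \lambda_1 z_1(t) \quad\text{and}\quad z_2'(t) = \lambda_2 z_2(t), \]

so that $z_k(t) = C_k e^{\lambda_k t}$ ($k = 1, 2$). Hence

\[ \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = P \begin{pmatrix} z_1(t) \\ z_2(t) \end{pmatrix} = (v_1, v_2) \begin{pmatrix} z_1(t) \\ z_2(t) \end{pmatrix} = C_1 e^{\lambda_1 t} v_1 + C_2 e^{\lambda_2 t} v_2. \]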



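Remark 5.3 If $\lambda_1 = \alpha + \beta i$ is complex (where $\beta \neq 0$) and $v = v_1 + v_2 i$ is a (complex) eigenvector,
then the general solution is given by

\[ \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = e^{\alpha t} C_1 \left( v_1 \cos \beta t - v_2 \sin \beta t \right) + e^{\alpha t} C_2 \left( v_2 \cos \beta t + v_1 \sin \beta t \right), \]

where $C_1$, $C_2$ are arbitrary constants.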

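Example 5.4 Solve the system of linear differential equations

\[ \frac{d}{dt} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -2 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \]

Solve the characteristic equation $\det(A - \lambda I) = 0$, i.e. $\lambda^2 - 3\lambda + 2 = 0$, to obtain the eigenvalues
$\lambda_1 = 1$ and $\lambda_2 = 2$. For $\lambda_1 = 1$, solve the linear system

\[ \begin{pmatrix} 0 - 1 & 1 \\ -2 & 3 - 1 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = 0 \]

to obtain $c_1 = c_2$, so $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ is an eigenvector with eigenvalue 1. Similarly, solve the linear system

\[ \begin{pmatrix} 0 - 2 & 1 \\ -2 & 3 - 2 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = 0 \]

to obtain an eigenvector $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$, so the general solution is

\[ \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = C_1 e^{t} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + C_2 e^{2t} \begin{pmatrix} 1 \\ 2 \end{pmatrix}. \]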
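
The eigenvalue method can be sketched in a few lines of Python for the $2 \times 2$ case. This is only an illustration under stated assumptions: the helper names `eigen_2x2` and `general_solution` are ours, and the sketch handles only matrices with $a_{12} \neq 0$ and two distinct real eigenvalues, which covers the matrix of Example 5.4.

```python
import math

def eigen_2x2(a, b, c, d):
    """Eigenpairs of [[a, b], [c, d]], assuming b != 0 and distinct real roots."""
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det          # discriminant of lam^2 - tr*lam + det = 0
    if disc <= 0 or b == 0:
        raise ValueError("sketch handles only b != 0 with distinct real eigenvalues")
    roots = [(tr - math.sqrt(disc)) / 2, (tr + math.sqrt(disc)) / 2]
    # The first row of (A - lam*I) v = 0 reads (a - lam) v1 + b v2 = 0,
    # which is satisfied by v = (b, lam - a).
    return [(lam, (b, lam - a)) for lam in roots]

def general_solution(pairs, C1, C2, t):
    """Evaluate C1*exp(l1*t)*v1 + C2*exp(l2*t)*v2 at time t."""
    (l1, v1), (l2, v2) = pairs
    e1, e2 = C1 * math.exp(l1 * t), C2 * math.exp(l2 * t)
    return e1 * v1[0] + e2 * v2[0], e1 * v1[1] + e2 * v2[1]

# Matrix of Example 5.4.
pairs = eigen_2x2(0, 1, -2, 3)
```

For the matrix of Example 5.4 this gives the eigenvalues 1 and 2 with eigenvectors proportional to $(1, 1)$ and $(1, 2)$.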



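Example 5.5 Solve the system

\[ \frac{d}{dt} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 & -5 \\ 2 & -4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \]

The characteristic equation of the matrix in the system,

\[ \det \begin{pmatrix} 2 - \lambda & -5 \\ 2 & -4 - \lambda \end{pmatrix} = \lambda^2 + 2\lambda + 2 = 0, \]

has a pair of conjugate complex roots $\lambda_1 = -1 + i$ and $\lambda_2 = -1 - i$. For $\lambda_1 = -1 + i$ the linear system

\[ \begin{pmatrix} 2 - \lambda_1 & -5 \\ 2 & -4 - \lambda_1 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = 0 \]

has a solution vector

\[ \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 5 \\ 3 - i \end{pmatrix} = \begin{pmatrix} 5 \\ 3 \end{pmatrix} + i \begin{pmatrix} 0 \\ -1 \end{pmatrix}, \]

thus, by Remark 5.3 with $\alpha = -1$ and $\beta = 1$,

\[ \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = e^{-t} C_1 \left[ \begin{pmatrix} 5 \\ 3 \end{pmatrix} \cos t - \begin{pmatrix} 0 \\ -1 \end{pmatrix} \sin t \right] + e^{-t} C_2 \left[ \begin{pmatrix} 0 \\ -1 \end{pmatrix} \cos t + \begin{pmatrix} 5 \\ 3 \end{pmatrix} \sin t \right]. \]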



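Example 5.6 Solve the initial value problem for the linear system

\[ \frac{d}{dt} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ -4 & 6 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}, \qquad x(0) = 1, \quad y(0) = 1. \]

The characteristic equation of the matrix in the system,

\[ \det \begin{pmatrix} 2 - \lambda & 1 \\ -4 & 6 - \lambda \end{pmatrix} = (\lambda - 4)^2 = 0, \]

has the repeated root 4, so the solutions are built from $e^{4t}$ and $t e^{4t}$. Taking into account the initial conditions,
we may set $x(t) = (At + 1)e^{4t}$ and $y(t) = (Bt + 1)e^{4t}$, and feed them into the system to obtain
$A = -1$ and $B = -2$. Thus the solution to the initial value problem is given by

\[ x(t) = (1 - t)e^{4t}, \qquad y(t) = (1 - 2t)e^{4t}. \]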
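
The solution of Example 5.6 can be checked numerically. The sketch below is only an illustration: it approximates the derivatives by central differences (the helper `derivative` and the step size `h` are our own choices) and returns the residuals of the two equations, which should be close to zero.

```python
import math

def x(t):
    return (1 - t) * math.exp(4 * t)

def y(t):
    return (1 - 2 * t) * math.exp(4 * t)

def derivative(g, t, h=1e-6):
    # Central difference approximation of g'(t).
    return (g(t + h) - g(t - h)) / (2 * h)

def residuals(t):
    """Return (x' - (2x + y), y' - (-4x + 6y)) at time t."""
    return (derivative(x, t) - (2 * x(t) + y(t)),
            derivative(y, t) - (-4 * x(t) + 6 * y(t)))
```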

6 Partial derivatives, chain rule 



From this section on, we study functions of several variables.



6.1 Computations of partial derivatives 

Let us begin with a (real) function of two variables, $u = f(x, y)$, defined on an open subset such as
an open disk, and begin with the partial derivatives of $f$. By saying that a subset $U$ of $\mathbb{R}^2$ (resp. $\mathbb{R}^n$)
is an open subset we mean that for any point $p \in U$ there is an open disk (resp. an open ball in $\mathbb{R}^n$)
$B_p(r)$ centered at $p$ with radius $r > 0$ such that $B_p(r) \subset U$.

Holding $y = y_0$ constant, consider $f(x, y_0)$ as a function of $x$. If its derivative (in $x$) exists at
$x_0$, i.e. if

\[ \lim_{x \to x_0} \frac{f(x, y_0) - f(x_0, y_0)}{x - x_0} \]

exists, then this limit is called the partial derivative of $f$ in $x$ at $(x_0, y_0)$, denoted by one of the
following notations:

\[ \frac{\partial f}{\partial x}(x_0, y_0), \quad f_x(x_0, y_0); \quad D_x u, \quad D_x f(x_0, y_0). \]

(It was C.G.J. Jacobi who first proposed to use the symbol $\partial$ instead of $d$ for partial derivatives.)
Similarly we may introduce the partial derivative in $y$, denoted by $\frac{\partial f}{\partial y}$ etc. The definition of partial
derivatives applies equally well to functions of three variables, and to functions of several variables.

Example 6.1 Find the partial derivatives of $u = y^x$. Holding $y$ constant, $u$ is an exponential
function of $x$ and $\frac{\partial u}{\partial x} = y^x \ln y$, while if we hold $x$ constant, it is a power function of $y$ so that $\frac{\partial u}{\partial y} = x y^{x-1}$.

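Example 6.2 Find the partial derivatives of $u = \frac{x}{x^2 + y^2 + z^2}$. The results are

\[ \frac{\partial u}{\partial x} = -\frac{2x^2}{(x^2 + y^2 + z^2)^2} + \frac{1}{x^2 + y^2 + z^2} = \frac{y^2 + z^2 - x^2}{(x^2 + y^2 + z^2)^2}, \]

\[ \frac{\partial u}{\partial y} = -\frac{2xy}{(x^2 + y^2 + z^2)^2}, \qquad \frac{\partial u}{\partial z} = -\frac{2xz}{(x^2 + y^2 + z^2)^2}. \]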

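Example 6.3 Let $u = y f(x^2 - y^2)$ where $f(t)$ is a differentiable function whose derivative is
denoted by $f'(t)$. Then

\[ \frac{1}{x}\frac{\partial u}{\partial x} + \frac{1}{y}\frac{\partial u}{\partial y} = \frac{u}{y^2}. \]

In fact, according to the chain rule we have

\[ \frac{\partial u}{\partial x} = 2xy f'(x^2 - y^2), \qquad \frac{\partial u}{\partial y} = f(x^2 - y^2) - 2y^2 f'(x^2 - y^2), \]

so that

\[ \frac{1}{x}\frac{\partial u}{\partial x} + \frac{1}{y}\frac{\partial u}{\partial y} = 2y f'(x^2 - y^2) + \frac{1}{y} f(x^2 - y^2) - 2y f'(x^2 - y^2) = \frac{1}{y} f(x^2 - y^2) = \frac{u}{y^2}. \]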



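Suppose $u = f(x, y)$ has partial derivatives $\frac{\partial u}{\partial x}$ and $\frac{\partial u}{\partial y}$ on an open subset, so that $\frac{\partial u}{\partial x}$
and $\frac{\partial u}{\partial y}$ are functions of the variables $x$ and $y$. Suppose that the partial derivative in $x$ of $\frac{\partial u}{\partial x}$ exists;
then $\frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\right)$ is called a second order partial derivative of $u$, denoted by $\frac{\partial^2 u}{\partial x^2}$ or by any of
the following:

\[ \frac{\partial^2 f(x_0, y_0)}{\partial x^2}, \quad u_{xx}, \quad f_{xx}(x_0, y_0); \quad D^2_{xx} u, \quad D^2_{xx} f(x_0, y_0). \]

Similarly $\frac{\partial}{\partial y}\left(\frac{\partial u}{\partial x}\right)$ is denoted by $\frac{\partial^2 u}{\partial x \partial y}$ etc. Higher order derivatives can be defined inductively.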
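
Example 6.4 Find the partial derivatives of $u = \tan^{-1}\frac{y}{x}$ up to second order. In fact

\[ \frac{\partial u}{\partial x} = -\frac{y}{x^2 + y^2}, \qquad \frac{\partial u}{\partial y} = \frac{x}{x^2 + y^2} \]

and

\[ \frac{\partial^2 u}{\partial x^2} = \frac{\partial}{\partial x}\left( -\frac{y}{x^2 + y^2} \right) = \frac{2xy}{(x^2 + y^2)^2}, \]

\[ \frac{\partial^2 u}{\partial x \partial y} = \frac{\partial}{\partial y}\left( -\frac{y}{x^2 + y^2} \right) = \frac{y^2 - x^2}{(x^2 + y^2)^2} = \frac{\partial^2 u}{\partial y \partial x}, \]

\[ \frac{\partial^2 u}{\partial y^2} = \frac{\partial}{\partial y}\left( \frac{x}{x^2 + y^2} \right) = -\frac{2xy}{(x^2 + y^2)^2}. \]

In particular, $u$ solves the Laplace equation

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0. \]

We can carry on to find

\[ \frac{\partial^3 u}{\partial x^2 \partial y} = \frac{\partial}{\partial y}\left( \frac{2xy}{(x^2 + y^2)^2} \right) = \frac{2x^3 - 6xy^2}{(x^2 + y^2)^3}. \]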
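
Partial derivatives such as those in Example 6.4 can be checked by finite differences. The following Python sketch is an illustration only; the helper names `d2` and `laplacian` and the step size `h` are our own choices, and `math.atan(y / x)` agrees with $\tan^{-1}\frac{y}{x}$ only for $x \neq 0$.

```python
import math

def u(x, y):
    return math.atan(y / x)   # valid for x != 0

def d2(f, x, y, axis, h=1e-4):
    """Central second difference in x (axis=0) or in y (axis=1)."""
    if axis == 0:
        return (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    return (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2

def laplacian(f, x, y):
    # Sum of the two pure second order partial derivatives.
    return d2(f, x, y, 0) + d2(f, x, y, 1)
```

At the point $(1, 2)$ the formula above gives $\frac{\partial^2 u}{\partial x^2} = \frac{4}{25}$, and the numerical Laplacian should be close to zero.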

6.2 The chain rule 

Let us concentrate on functions of two variables for simplicity, but what we are going to do can be 
generalized to functions of several variables with proper modifications. 

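Lemma 6.5 Suppose that $u = f(x, y)$, defined on an open subset $U$, has first order partial derivatives
$\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ which are continuous functions on $U$. Let $(x_0, y_0) \in U$, $\Delta x = x - x_0$, $\Delta y = y - y_0$ and
$\Delta u = f(x, y) - f(x_0, y_0)$. Then

\[ \Delta u = \frac{\partial f(x_0, y_0)}{\partial x} \Delta x + \frac{\partial f(x_0, y_0)}{\partial y} \Delta y + \alpha, \]

where the remainder $\alpha$ depends on $(x_0, y_0)$ and $(x, y)$, and

\[ \lim_{(x, y) \to (x_0, y_0)} \frac{\alpha}{\sqrt{\Delta x^2 + \Delta y^2}} = 0. \]

That is, $\alpha$ is small in comparison with $\Delta x$ and $\Delta y$, thus the main part of the increment $\Delta u$ at
$(x_0, y_0)$ is

\[ \frac{\partial f(x_0, y_0)}{\partial x} \Delta x + \frac{\partial f(x_0, y_0)}{\partial y} \Delta y, \]

which is linear in the increments $(\Delta x, \Delta y)$ of the independent variables, and is called the first order differential
of $f$ at $(x_0, y_0)$.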
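
Proof. By a simple inspection we can see easily that

\[
\begin{aligned}
\alpha ={}& \left( \frac{f(x, y) - f(x_0, y)}{\Delta x} - \frac{\partial f(x_0, y)}{\partial x} \right) \Delta x
+ \left( \frac{f(x_0, y) - f(x_0, y_0)}{\Delta y} - \frac{\partial f(x_0, y_0)}{\partial y} \right) \Delta y \\
&+ \left( \frac{\partial f(x_0, y)}{\partial x} - \frac{\partial f(x_0, y_0)}{\partial x} \right) \Delta x,
\end{aligned}
\]

where we have used the convention that, if $\Delta x = 0$ (resp. $\Delta y = 0$), then the term $f(x, y) - f(x_0, y) = 0$
(resp. $f(x_0, y) - f(x_0, y_0) = 0$) and the quotient $\frac{f(x, y) - f(x_0, y)}{\Delta x}$ (resp. $\frac{f(x_0, y) - f(x_0, y_0)}{\Delta y}$) is regarded as
zero. Then, since

\[ \frac{\Delta x}{\sqrt{\Delta x^2 + \Delta y^2}}, \qquad \frac{\Delta y}{\sqrt{\Delta x^2 + \Delta y^2}} \]

are bounded, and (by the mean value theorem and the continuity of the partial derivatives)

\[ \frac{f(x, y) - f(x_0, y)}{\Delta x} - \frac{\partial f(x_0, y)}{\partial x} \to 0, \qquad
\frac{f(x_0, y) - f(x_0, y_0)}{\Delta y} - \frac{\partial f(x_0, y_0)}{\partial y} \to 0, \qquad
\frac{\partial f(x_0, y)}{\partial x} - \frac{\partial f(x_0, y_0)}{\partial x} \to 0 \]

as $\sqrt{\Delta x^2 + \Delta y^2} \to 0$, we obtain

\[ \frac{\alpha}{\sqrt{\Delta x^2 + \Delta y^2}} \to 0 \]

as $\sqrt{\Delta x^2 + \Delta y^2} \to 0$.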



The first order differential of $u = f(x, y)$ is denoted by $du$ or $df$; therefore

\[ df = \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy. \]

The function

\[ z = f(x_0, y_0) + \frac{\partial f(x_0, y_0)}{\partial x}(x - x_0) + \frac{\partial f(x_0, y_0)}{\partial y}(y - y_0) \]

is the linear approximation of $z = f(x, y)$ near the point $(x_0, y_0)$. The above linear equation in
$(x, y, z)$ represents the tangent plane to the graph of the function $z = f(x, y)$ at the point
$(x_0, y_0, f(x_0, y_0))$. We will return to this topic in the following lectures.
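
To see the statement about the remainder at work, here is a small Python sketch. The function $f(x, y) = x^2 y$ and the base point $(1, 2)$ are our own illustrative choices, not taken from the notes; the sketch evaluates the ratio of the remainder to $\sqrt{\Delta x^2 + \Delta y^2}$ for shrinking increments.

```python
import math

def f(x, y):
    return x * x * y

def fx(x, y):
    return 2 * x * y      # partial derivative in x

def fy(x, y):
    return x * x          # partial derivative in y

def linear_approx(x0, y0, x, y):
    # Tangent plane of z = f(x, y) at (x0, y0), evaluated at (x, y).
    return f(x0, y0) + fx(x0, y0) * (x - x0) + fy(x0, y0) * (y - y0)

def remainder_ratio(x0, y0, dx, dy):
    # Remainder of the linear approximation divided by the size of the increment.
    alpha = f(x0 + dx, y0 + dy) - linear_approx(x0, y0, x0 + dx, y0 + dy)
    return alpha / math.hypot(dx, dy)

# Increments (s, 2s) shrinking towards (0, 0).
ratios = [remainder_ratio(1.0, 2.0, s, 2 * s) for s in (1e-1, 1e-2, 1e-3)]
```

For this choice the ratios decrease roughly in proportion to the size of the increment, as the lemma predicts.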

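Lemma 6.6 (Chain rule for two variable functions) Suppose that $f(x, y)$ is a function on an open
subset $U \subset \mathbb{R}^2$ with continuous partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$, and suppose $x = \varphi(t)$ and $y = \psi(t)$
are two differentiable functions on an interval $(a, b)$ such that $(\varphi(t), \psi(t)) \in U$ for every $t \in (a, b)$.
Let $F(t) = f(\varphi(t), \psi(t))$. Then $F$ is differentiable on $(a, b)$ and

\[ \frac{dF(t)}{dt} = \frac{\partial f}{\partial x} \frac{d\varphi(t)}{dt} + \frac{\partial f}{\partial y} \frac{d\psi(t)}{dt}, \tag{6.2} \]

where the partials $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are evaluated at $x = \varphi(t)$ and $y = \psi(t)$.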






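Proof. For $t_0$ and $t$ in $(a, b)$ we have, by Lemma 6.5,

\[ F(t) - F(t_0) = \frac{\partial f}{\partial x} \Delta x + \frac{\partial f}{\partial y} \Delta y + \alpha, \]

where the partials are evaluated at $(\varphi(t_0), \psi(t_0))$ and

\[ \Delta x = \varphi(t) - \varphi(t_0), \qquad \Delta y = \psi(t) - \psi(t_0). \]

Hence, with $\Delta t = t - t_0$,

\[ \frac{F(t) - F(t_0)}{\Delta t} = \frac{\partial f}{\partial x} \frac{\Delta x}{\Delta t} + \frac{\partial f}{\partial y} \frac{\Delta y}{\Delta t} + \frac{\alpha}{\Delta t}. \]

Letting $\Delta t \to 0$, we obtain

\[ \frac{\Delta x}{\Delta t} \to \varphi'(t_0), \qquad \frac{\Delta y}{\Delta t} \to \psi'(t_0), \]

and therefore

\[ \frac{\alpha}{\Delta t} = \frac{\alpha}{\sqrt{\Delta x^2 + \Delta y^2}} \sqrt{\left( \frac{\Delta x}{\Delta t} \right)^2 + \left( \frac{\Delta y}{\Delta t} \right)^2} \, \frac{|\Delta t|}{\Delta t} \to 0. \]

Consequently

\[
\begin{aligned}
F'(t_0) &= \lim_{t \to t_0} \frac{F(t) - F(t_0)}{\Delta t} \\
&= \frac{\partial f}{\partial x} \lim_{t \to t_0} \frac{\Delta x}{\Delta t} + \frac{\partial f}{\partial y} \lim_{t \to t_0} \frac{\Delta y}{\Delta t} + \lim_{t \to t_0} \frac{\alpha}{\Delta t} \\
&= \frac{\partial f}{\partial x} \frac{d\varphi(t_0)}{dt} + \frac{\partial f}{\partial y} \frac{d\psi(t_0)}{dt}.
\end{aligned}
\]
■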

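Suppose we make the change of variables $x = \varphi(s, t)$ and $y = \psi(s, t)$, and assume that $\varphi$ and $\psi$ have
continuous partial derivatives. Consider $F(s, t) = f(\varphi(s, t), \psi(s, t))$. Holding $t$ constant and
applying the chain rule (6.2) to the variable $s$ we obtain

\[ \frac{\partial F}{\partial s} = \frac{\partial f}{\partial x} \frac{\partial \varphi}{\partial s} + \frac{\partial f}{\partial y} \frac{\partial \psi}{\partial s} \]

and similarly

\[ \frac{\partial F}{\partial t} = \frac{\partial f}{\partial x} \frac{\partial \varphi}{\partial t} + \frac{\partial f}{\partial y} \frac{\partial \psi}{\partial t}. \]

In terms of matrices, the chain rule may be put into a neat form

\[ \left( \frac{\partial F}{\partial s}, \frac{\partial F}{\partial t} \right) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \begin{pmatrix} \frac{\partial \varphi}{\partial s} & \frac{\partial \varphi}{\partial t} \\ \frac{\partial \psi}{\partial s} & \frac{\partial \psi}{\partial t} \end{pmatrix}; \]

the $2 \times 2$ matrix on the right hand side is called the first order total derivative (or the Jacobian
matrix) of the transformation $x = \varphi(s, t)$ and $y = \psi(s, t)$, denoted by $D(\varphi, \psi)$.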

If all involved functions have continuous higher order partial derivatives, then we may repeat 
the use of the chain rule. 
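
The matrix form of the chain rule is easy to test numerically. In the Python sketch below, the functions $f(x, y) = x^2 y$, $\varphi(s, t) = s \cos t$ and $\psi(s, t) = s \sin t$ are hypothetical choices made only for illustration; the sketch compares the product of the gradient row vector with the Jacobian matrix against a central difference approximation of $(\partial F/\partial s, \partial F/\partial t)$.

```python
import math

def f(x, y):
    return x * x * y

def grad_f(x, y):
    return (2 * x * y, x * x)            # (df/dx, df/dy)

def phi(s, t):
    return s * math.cos(t)

def psi(s, t):
    return s * math.sin(t)

def jacobian(s, t):
    """Rows: (dphi/ds, dphi/dt) and (dpsi/ds, dpsi/dt)."""
    return ((math.cos(t), -s * math.sin(t)),
            (math.sin(t), s * math.cos(t)))

def chain_rule(s, t):
    gx, gy = grad_f(phi(s, t), psi(s, t))
    (a, b), (c, d) = jacobian(s, t)
    return (gx * a + gy * c, gx * b + gy * d)   # row vector times matrix

def finite_difference(s, t, h=1e-6):
    def F(s, t):
        return f(phi(s, t), psi(s, t))
    return ((F(s + h, t) - F(s - h, t)) / (2 * h),
            (F(s, t + h) - F(s, t - h)) / (2 * h))
```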
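
Example 6.7 Let $F(s, t) = f(\varphi(s, t), \psi(s, t))$. Evaluate $\frac{\partial^2 F}{\partial t \partial s}$.
By definition $\frac{\partial^2 F}{\partial t \partial s} = \frac{\partial}{\partial s}\left( \frac{\partial F}{\partial t} \right)$, so that

\[
\begin{aligned}
\frac{\partial^2 F}{\partial t \partial s}
&= \frac{\partial}{\partial s}\left( \frac{\partial f}{\partial x} \frac{\partial \varphi}{\partial t} + \frac{\partial f}{\partial y} \frac{\partial \psi}{\partial t} \right) \\
&= \frac{\partial}{\partial s}\left( \frac{\partial f}{\partial x} \frac{\partial \varphi}{\partial t} \right) + \frac{\partial}{\partial s}\left( \frac{\partial f}{\partial y} \frac{\partial \psi}{\partial t} \right) \\
&= \frac{\partial}{\partial s}\left( \frac{\partial f}{\partial x} \right) \frac{\partial \varphi}{\partial t} + \frac{\partial f}{\partial x} \frac{\partial^2 \varphi}{\partial t \partial s}
 + \frac{\partial}{\partial s}\left( \frac{\partial f}{\partial y} \right) \frac{\partial \psi}{\partial t} + \frac{\partial f}{\partial y} \frac{\partial^2 \psi}{\partial t \partial s}
 && \text{[product rule]} \\
&= \left( \frac{\partial^2 f}{\partial x^2} \frac{\partial \varphi}{\partial s} + \frac{\partial^2 f}{\partial x \partial y} \frac{\partial \psi}{\partial s} \right) \frac{\partial \varphi}{\partial t} + \frac{\partial f}{\partial x} \frac{\partial^2 \varphi}{\partial t \partial s}
 && \text{[chain rule to } \tfrac{\partial f}{\partial x}\text{]} \\
&\quad + \left( \frac{\partial^2 f}{\partial y \partial x} \frac{\partial \varphi}{\partial s} + \frac{\partial^2 f}{\partial y^2} \frac{\partial \psi}{\partial s} \right) \frac{\partial \psi}{\partial t} + \frac{\partial f}{\partial y} \frac{\partial^2 \psi}{\partial t \partial s}
 && \text{[chain rule to } \tfrac{\partial f}{\partial y}\text{]}.
\end{aligned}
\]

Here, the important thing we should keep in mind when working with higher partial derivatives is
that the symbols $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are again functions of $x$ and $y$, hence of $s$ and $t$, so we have to apply
the chain rule to these functions again.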

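As a direct consequence of the chain rule, we can show that the first order differentials are
invariant under substitutions. To be more precise, if $F(s, t) = f(\varphi(s, t), \psi(s, t))$, then

\[ \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy = \frac{\partial F}{\partial s} ds + \frac{\partial F}{\partial t} dt, \]

where

\[ dx = \frac{\partial \varphi}{\partial s} ds + \frac{\partial \varphi}{\partial t} dt, \qquad dy = \frac{\partial \psi}{\partial s} ds + \frac{\partial \psi}{\partial t} dt. \]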

The invariance of the first differentials under change of variables is useful in evaluating partial 

derivatives, but more importantly, it implies that differentials of functions are globally defined 

objects which do not depend on the coordinates we use to evaluate them. 

Let us write down the chain rule for several variable functions. 

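Suppose that $f(x_1, \dots, x_m)$ is a function of $m$ variables which has continuous partial derivatives.
Consider the change of variables given by

\[
\begin{aligned}
x_1 &= \varphi_1(t_1, \dots, t_n), \\
&\;\;\vdots \\
x_m &= \varphi_m(t_1, \dots, t_n),
\end{aligned}
\tag{6.3}
\]

where $n \in \mathbb{N}$ and $\varphi_1, \dots, \varphi_m$ are functions of $(t_1, \dots, t_n)$ which have continuous partial derivatives.
Let

\[ F(t_1, \dots, t_n) = f(\varphi_1(t_1, \dots, t_n), \dots, \varphi_m(t_1, \dots, t_n)). \]

Then

\[ \frac{\partial F}{\partial t_j} = \frac{\partial f}{\partial x_1} \frac{\partial \varphi_1}{\partial t_j} + \cdots + \frac{\partial f}{\partial x_m} \frac{\partial \varphi_m}{\partial t_j}, \tag{6.4} \]

where $j = 1, \dots, n$. In terms of matrix notations, the chain rule may be put in the following form

\[ \left( \frac{\partial F}{\partial t_1}, \dots, \frac{\partial F}{\partial t_n} \right) = \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_m} \right)
\begin{pmatrix} \frac{\partial \varphi_1}{\partial t_1} & \cdots & \frac{\partial \varphi_1}{\partial t_n} \\ \vdots & & \vdots \\ \frac{\partial \varphi_m}{\partial t_1} & \cdots & \frac{\partial \varphi_m}{\partial t_n} \end{pmatrix}, \tag{6.5} \]

where the $m \times n$ matrix on the right-hand side of (6.5) is called the first order total derivative
associated with the transformation (6.3), denoted by $D(\varphi_1, \dots, \varphi_m)$. A careful study of the
total derivatives for vector valued functions such as (6.3) will be the topic of Part A Option Multi-Variable Calculus.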

Example 6.8 Consider $u = x^y$ where $x = \varphi(t)$ and $y = \psi(t)$, so that $u(t) = \varphi(t)^{\psi(t)}$. According to
the chain rule

\[ \frac{du}{dt} = \varphi'(t) \, y x^{y-1} + \psi'(t) \, x^y \ln x. \]

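Example 6.9 Let $u = f(x, y, z)$ have continuous partial derivatives. Let $x = \eta - \zeta$, $y = \zeta - \xi$ and
$z = \xi - \eta$. Work out the matrix of the first order total derivative for the transformation:

\[ \begin{pmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial x}{\partial \eta} & \frac{\partial x}{\partial \zeta} \\ \frac{\partial y}{\partial \xi} & \frac{\partial y}{\partial \eta} & \frac{\partial y}{\partial \zeta} \\ \frac{\partial z}{\partial \xi} & \frac{\partial z}{\partial \eta} & \frac{\partial z}{\partial \zeta} \end{pmatrix}
= \begin{pmatrix} 0 & 1 & -1 \\ -1 & 0 & 1 \\ 1 & -1 & 0 \end{pmatrix}, \]

so, by the chain rule,

\[ \left( \frac{\partial u}{\partial \xi}, \frac{\partial u}{\partial \eta}, \frac{\partial u}{\partial \zeta} \right)
= \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) \begin{pmatrix} 0 & 1 & -1 \\ -1 & 0 & 1 \\ 1 & -1 & 0 \end{pmatrix}
= \left( -\frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}, \; \frac{\partial f}{\partial x} - \frac{\partial f}{\partial z}, \; -\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} \right), \]

that is,

\[ \frac{\partial u}{\partial \xi} = -\frac{\partial f}{\partial y} + \frac{\partial f}{\partial z}, \qquad
\frac{\partial u}{\partial \eta} = \frac{\partial f}{\partial x} - \frac{\partial f}{\partial z}, \qquad
\frac{\partial u}{\partial \zeta} = -\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}. \]

Using the chain rule again we have

\[
\begin{aligned}
\frac{\partial^2 u}{\partial \xi \partial \eta}
&= \frac{\partial}{\partial \eta}\left( -\frac{\partial f}{\partial y} + \frac{\partial f}{\partial z} \right)
= -\frac{\partial}{\partial \eta} \frac{\partial f}{\partial y} + \frac{\partial}{\partial \eta} \frac{\partial f}{\partial z} \\
&= -\left( \frac{\partial^2 f}{\partial y \partial x} - \frac{\partial^2 f}{\partial y \partial z} \right) + \frac{\partial^2 f}{\partial z \partial x} - \frac{\partial^2 f}{\partial z \partial z}.
\end{aligned}
\]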

6.3 Partial derivatives for implicit functions

The chain rule allows us to evaluate partial derivatives for implicit functions. Let us look at an
example first.

Example 6.10 Let $y = y(x)$ be the function implicitly given by the equation $x^2 + y^2 = 1$ where
$x > 0$ and $y > 0$. Of course we can solve for $y$ to obtain $y = \sqrt{1 - x^2}$, thus

\[ \frac{dy}{dx} = \frac{1}{2} \cdot \frac{-2x}{\sqrt{1 - x^2}} = -\frac{x}{\sqrt{1 - x^2}}. \]

We can also work out the derivative $\frac{dy}{dx}$ by just using the equation $x^2 + y^2 = 1$. Taking the derivative of both
sides of the equation in $x$, keeping in mind that $y$ is a function of $x$, we obtain

\[ 2x + 2y \frac{dy}{dx} = 0, \]

and solving for $\frac{dy}{dx}$ we obtain $\frac{dy}{dx} = -\frac{x}{y}$, which gives just the same answer.

The idea used in the previous example can be applied to evaluating partial derivatives for
implicit functions. Suppose that $y = y(x)$ is a function of $x$ implicitly given by an equation

\[ F(x, y) = 0. \]

In order to solve $y$ from the equation to determine the function $y = y(x)$ at least locally, we
need to impose some conditions. Let us assume the partial derivatives of $F$ (considering $x$, $y$
as independent variables) exist and are continuous, and assume that $\frac{\partial F}{\partial y} \neq 0$. To find the
derivative $\frac{dy}{dx}$ we take the derivative of both sides of the equation $F(x, y) = 0$ in $x$, keeping in mind that
$y = y(x)$ is a function of $x$. Then

\[ \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y} \frac{dy}{dx} = 0. \]

The left hand side is the result of applying the chain rule to $F$ with $x = x$ and $y = y(x)$.
Since $F_y \neq 0$, by solving for $\frac{dy}{dx}$ we obtain

\[ \frac{dy}{dx} = -\frac{F_x}{F_y}. \]
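
The formula $\frac{dy}{dx} = -\frac{F_x}{F_y}$ can be tried out on the circle of Example 6.10. The Python sketch below is a minimal illustration with $F(x, y) = x^2 + y^2 - 1$; the names `Fx`, `Fy` and `implicit_slope` and the sample point $x_0 = 0.6$ are our own choices.

```python
import math

def Fx(x, y):
    return 2 * x          # partial derivative of F in x

def Fy(x, y):
    return 2 * y          # partial derivative of F in y

def implicit_slope(x, y):
    """dy/dx = -F_x / F_y at a point of the curve F(x, y) = 0."""
    if Fy(x, y) == 0:
        raise ZeroDivisionError("F_y vanishes; the formula does not apply here")
    return -Fx(x, y) / Fy(x, y)

x0 = 0.6
y0 = math.sqrt(1 - x0 * x0)               # the branch with y > 0
explicit = -x0 / math.sqrt(1 - x0 * x0)   # derivative of sqrt(1 - x^2) at x0
```

Both routes give the slope $-\frac{3}{4}$ at the point $(0.6, 0.8)$.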

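This idea applies to implicit functions of several variables. For example, if $z = z(x, y)$ is a function
implicitly given by the equation

\[ F(x, y, z) = 0, \]

and if the partial derivatives of $F$ (considering $x$, $y$, $z$ as independent variables) are continuous
and $F_z \neq 0$, then, taking the derivative in $x$ while holding $y$ constant, we obtain

\[ F_x + F_z \frac{\partial z}{\partial x} = 0, \tag{6.6} \]

so that

\[ \frac{\partial z}{\partial x} = -\frac{F_x}{F_z}. \]

Similarly we have

\[ \frac{\partial z}{\partial y} = -\frac{F_y}{F_z}. \]

To compute the second partial derivatives, we continue the same procedure. Taking the derivative of both
sides of (6.6) in $y$ we obtain

\[ F_{xy} + F_{xz} \frac{\partial z}{\partial y} + \left( F_{zy} + F_{zz} \frac{\partial z}{\partial y} \right) \frac{\partial z}{\partial x} + F_z \frac{\partial^2 z}{\partial x \partial y} = 0, \]

and solving we obtain

\[ \frac{\partial^2 z}{\partial x \partial y} = -\frac{F_{xy} + F_{xz} \frac{\partial z}{\partial y} + \left( F_{zy} + F_{zz} \frac{\partial z}{\partial y} \right) \frac{\partial z}{\partial x}}{F_z} \]

etc., though the formulae become increasingly complicated.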

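Finally we mention that the same idea applies to several functions of several variables. For
example, from the system

\[ F(x, y, z) = 0, \qquad G(x, y, z) = 0, \tag{6.7} \]

we hope to solve $y$ and $z$ in terms of the variable $x$, thus $y = y(x)$ and $z = z(x)$. Saying that $y(x)$
and $z(x)$ are solutions means that if we replace $(y, z)$ in the system (6.7) by $(y(x), z(x))$, then

\[ F(x, y(x), z(x)) = 0, \qquad G(x, y(x), z(x)) = 0 \tag{6.8} \]

hold identically over the range of $x$. Therefore, by taking the derivative of both sides of the
equations in $x$, and employing the chain rule, we have

\[ F_x + F_y \frac{dy}{dx} + F_z \frac{dz}{dx} = 0, \qquad G_x + G_y \frac{dy}{dx} + G_z \frac{dz}{dx} = 0, \tag{6.9} \]

which is a linear system in $\left( \frac{dy}{dx}, \frac{dz}{dx} \right)$, and can be put in matrix form, namely

\[ \begin{pmatrix} F_y & F_z \\ G_y & G_z \end{pmatrix} \begin{pmatrix} \frac{dy}{dx} \\ \frac{dz}{dx} \end{pmatrix} = -\begin{pmatrix} F_x \\ G_x \end{pmatrix}. \tag{6.10} \]

We may solve for $\frac{dy}{dx}$ and $\frac{dz}{dx}$ as long as

\[ \det \begin{pmatrix} F_y & F_z \\ G_y & G_z \end{pmatrix} = F_y G_z - F_z G_y \neq 0. \tag{6.11} \]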
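[In fact, near a point $(x, y, z)$ where $F_y G_z - F_z G_y \neq 0$, we can show that $(y, z)$ can be solved from
the system (6.7) at least locally, which is a part of the conclusion of the so-called Inverse Function
Theorem. The proper formulation and proof of the inverse function theorem will be topics
for the Part A option Multi-Variable Calculus in Hilary term.] Indeed, under the condition
(6.11) the matrix

\[ \begin{pmatrix} F_y & F_z \\ G_y & G_z \end{pmatrix} \]

is invertible, so that

\[ \begin{pmatrix} \frac{dy}{dx} \\ \frac{dz}{dx} \end{pmatrix}
= -\begin{pmatrix} F_y & F_z \\ G_y & G_z \end{pmatrix}^{-1} \begin{pmatrix} F_x \\ G_x \end{pmatrix}
= -\frac{1}{F_y G_z - F_z G_y} \begin{pmatrix} G_z & -F_z \\ -G_y & F_y \end{pmatrix} \begin{pmatrix} F_x \\ G_x \end{pmatrix}
= -\frac{1}{F_y G_z - F_z G_y} \begin{pmatrix} G_z F_x - G_x F_z \\ F_y G_x - F_x G_y \end{pmatrix}. \]

Hence

\[ \frac{dy}{dx} = -\frac{G_z F_x - G_x F_z}{F_y G_z - F_z G_y}
= -\begin{vmatrix} F_x & F_z \\ G_x & G_z \end{vmatrix} \Big/ \begin{vmatrix} F_y & F_z \\ G_y & G_z \end{vmatrix} \]

and

\[ \frac{dz}{dx} = -\frac{F_y G_x - F_x G_y}{F_y G_z - F_z G_y}
= -\begin{vmatrix} F_y & F_x \\ G_y & G_x \end{vmatrix} \Big/ \begin{vmatrix} F_y & F_z \\ G_y & G_z \end{vmatrix}. \]
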

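Example 6.11 Let $y = y(x)$ and $z = z(x)$ be the functions satisfying the following equations

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \quad\text{and}\quad x + y + z = 0, \]

where $x > 0$, $y > 0$ and $z > 0$. Find the derivatives $\frac{dy}{dx}$ and $\frac{dz}{dx}$.

We may differentiate the equations in $x$, while keeping in mind that $y$ and $z$ are functions of $x$, so
according to the chain rule

\[ \frac{2x}{a^2} + \frac{2y}{b^2} \frac{dy}{dx} + \frac{2z}{c^2} \frac{dz}{dx} = 0 \tag{6.12} \]

and

\[ 1 + \frac{dy}{dx} + \frac{dz}{dx} = 0. \tag{6.13} \]

From (6.13) we obtain

\[ \frac{dz}{dx} = -1 - \frac{dy}{dx}, \]

and substituting this into (6.12) we get

\[ \frac{x}{a^2} + \frac{y}{b^2} \frac{dy}{dx} - \frac{z}{c^2} \left( 1 + \frac{dy}{dx} \right) = 0, \]

from which we may solve for $\frac{dy}{dx}$; hence

\[ \frac{dy}{dx} = \frac{\frac{z}{c^2} - \frac{x}{a^2}}{\frac{y}{b^2} - \frac{z}{c^2}}, \qquad
\frac{dz}{dx} = -1 - \frac{dy}{dx} = \frac{\frac{x}{a^2} - \frac{y}{b^2}}{\frac{y}{b^2} - \frac{z}{c^2}} \]

at those points where $\frac{y}{b^2} - \frac{z}{c^2} \neq 0$.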

6.4 Some differential operators

The symbols for partial differentiation such as $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ etc. may be considered as operations acting
on functions (which have continuous partial derivatives), sending $f$ to its partial derivatives $\frac{\partial f}{\partial x}$,
$\frac{\partial f}{\partial y}$ etc. It is useful to be familiar with some differential operators which are used extensively in
science.

The symbol $\nabla$ In the $n$-dimensional Euclidean space $\mathbb{R}^n$, the symbol $\nabla$ means the total differentiation.
Under the Cartesian coordinate system $(x_1, \dots, x_n)$, $\nabla$ denotes the total derivative

\[ \nabla = \left( \frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n} \right). \]

When $\nabla$ applies to a function $f(x_1, \dots, x_n)$ with continuous partial derivatives, $\nabla f$ means the
total derivative

\[ \nabla f = \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right), \]

called the gradient (vector field) of $f$. $\nabla f$ may be considered as a function taking values in $\mathbb{R}^n$ (such
a function is called a vector-valued function, and also a vector field in $\mathbb{R}^n$ in this special case where
the number of component functions in $\nabla f$ is exactly the dimension $n$ of $\mathbb{R}^n$).

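On the other hand, if

\[ \mathbf{u}(x_1, \dots, x_n) = \left( u^1(x_1, \dots, x_n), \dots, u^n(x_1, \dots, x_n) \right) \]

is a function of $n$ variables defined on $U \subset \mathbb{R}^n$, taking values in $\mathbb{R}^n$ (a vector field on $U$), then we
may form the dot product between $\nabla$ and $\mathbf{u}$ to obtain a real valued function by

\[ \nabla \cdot \mathbf{u} = \left( \frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n} \right) \cdot \left( u^1, \dots, u^n \right) = \frac{\partial u^1}{\partial x_1} + \cdots + \frac{\partial u^n}{\partial x_n}, \]

which is called the divergence of the vector field $\mathbf{u}$.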

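We have seen that, if $f(x_1, \dots, x_n)$ is a scalar function on $U \subset \mathbb{R}^n$, then its gradient $\nabla f$ is a
vector field, so we may apply the dot product between $\nabla$ and $\nabla f$, to obtain

\[ \nabla \cdot \nabla f = \left( \frac{\partial}{\partial x_1}, \dots, \frac{\partial}{\partial x_n} \right) \cdot \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right) = \frac{\partial^2 f}{\partial x_1^2} + \cdots + \frac{\partial^2 f}{\partial x_n^2}, \]

which is called the Laplacian of $f$, denoted by $\Delta f$. Thus, we introduce the differential operator of
second order

\[ \Delta = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_n^2}, \]

called the Laplace operator in $\mathbb{R}^n$. We extend the operation of $\Delta$ to vector valued functions as
follows. Suppose

\[ \mathbf{f}(x_1, \dots, x_n) = \left( f^1(x_1, \dots, x_n), \dots, f^m(x_1, \dots, x_n) \right) \]

is a vector valued function (where $m \in \mathbb{N}$) defined on $U \subset \mathbb{R}^n$; then we define

\[ \Delta \mathbf{f} = \left( \Delta f^1, \dots, \Delta f^m \right). \]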

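Curl operator $\nabla \times$ In the 3-dimensional Euclidean space $\mathbb{R}^3$, besides the dot product, there is
another multiplication between vectors called the cross product. Recall that, under the Cartesian
coordinate system $(x, y, z)$, if $\mathbf{a} = (a_1, a_2, a_3)$ and $\mathbf{b} = (b_1, b_2, b_3)$ then the cross product $\mathbf{a} \times \mathbf{b}$ is
defined by

\[
\begin{aligned}
\mathbf{a} \times \mathbf{b} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}
= \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} \mathbf{i} - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} \mathbf{j} + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \mathbf{k} \\
&= (a_2 b_3 - a_3 b_2, \; a_3 b_1 - a_1 b_3, \; a_1 b_2 - a_2 b_1),
\end{aligned}
\]

where $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are the standard basis in $\mathbb{R}^3$: $\mathbf{i} = (1, 0, 0)$, $\mathbf{j} = (0, 1, 0)$ and $\mathbf{k} = (0, 0, 1)$. $\mathbf{a} \times \mathbf{b}$ is the
unique vector which is perpendicular to both $\mathbf{a}$ and $\mathbf{b}$ obeying the right hand rule, with magnitude
$|\mathbf{a} \times \mathbf{b}| = |\mathbf{a}||\mathbf{b}| \sin \angle(\mathbf{a}, \mathbf{b})$, where $0 \le \angle(\mathbf{a}, \mathbf{b}) \le \pi$ is the angle between $\mathbf{a}$ and $\mathbf{b}$.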

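We apply this definition by replacing $\mathbf{a}$ with $\nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right)$ and $\mathbf{b}$ with a vector field $\mathbf{u} = (u^1, u^2, u^3)$,
where $u^1, u^2, u^3$ are functions on $U \subset \mathbb{R}^3$ with continuous partial derivatives, and define the curl
of the vector field $\mathbf{u}$ by

\[
\begin{aligned}
\nabla \times \mathbf{u} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ u^1 & u^2 & u^3 \end{vmatrix}
= \begin{vmatrix} \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ u^2 & u^3 \end{vmatrix} \mathbf{i}
- \begin{vmatrix} \frac{\partial}{\partial x} & \frac{\partial}{\partial z} \\ u^1 & u^3 \end{vmatrix} \mathbf{j}
+ \begin{vmatrix} \frac{\partial}{\partial x} & \frac{\partial}{\partial y} \\ u^1 & u^2 \end{vmatrix} \mathbf{k} \\
&= \left( \frac{\partial u^3}{\partial y} - \frac{\partial u^2}{\partial z}, \; \frac{\partial u^1}{\partial z} - \frac{\partial u^3}{\partial x}, \; \frac{\partial u^2}{\partial x} - \frac{\partial u^1}{\partial y} \right).
\end{aligned}
\]

$\nabla \times \mathbf{u}$ is again a vector field on $U \subset \mathbb{R}^3$, also called the vorticity of $\mathbf{u}$.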
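
Example 6.12 Let $f(x, y, z) = x^2 + y^2 + z^2$. Compute

\[ \nabla f = 2(x, y, z) \]

and

\[ \Delta f = 2(1 + 1 + 1) = 6, \]

a constant function. Moreover,

\[ \nabla \times \nabla f = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ 2x & 2y & 2z \end{vmatrix} = (0, 0, 0). \]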
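
The gradient and the Laplacian of Example 6.12 can be approximated by central differences. The Python sketch below is an illustration only: the helper names `gradient` and `laplacian`, the step sizes and the sample point $(1, 2, 3)$ are our own choices.

```python
def f(x, y, z):
    return x * x + y * y + z * z

def gradient(f, p, h=1e-5):
    """Central difference approximation of the gradient of f at the point p."""
    g = []
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        g.append((f(*hi) - f(*lo)) / (2 * h))
    return g

def laplacian(f, p, h=1e-3):
    """Sum of central second differences of f at the point p."""
    total = 0.0
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        total += (f(*hi) - 2 * f(*p) + f(*lo)) / h**2
    return total

grad_at_p = gradient(f, (1.0, 2.0, 3.0))
lap_at_p = laplacian(f, (1.0, 2.0, 3.0))
```

At $(1, 2, 3)$ the results should be close to $2(1, 2, 3) = (2, 4, 6)$ and to $6$ respectively.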



6.5 Change of coordinates and Jacobians 

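Sometimes it is useful to choose a special coordinate system which suits a specific
problem better. Suppose $(x, y)$ (respectively $(x, y, z)$) is the Cartesian coordinate system in $\mathbb{R}^2$ (resp.
in $\mathbb{R}^3$). Consider another coordinate system $(u, v)$ given by the equations $u = u(x, y)$ and
$v = v(x, y)$, and equivalently $x = x(u, v)$ and $y = y(u, v)$. We call the mapping $(x, y) \to (u, v)$
a transformation of coordinates, or change of variables. We only consider those transformations
which have continuous partial derivatives. According to the chain rule, if $f(x, y)$ is a function with
continuous partial derivatives, then

\[ \left( \frac{\partial f}{\partial u}, \frac{\partial f}{\partial v} \right) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \begin{pmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{pmatrix}. \]

The determinant of the first order total derivative (the Jacobian matrix),

\[ \det \begin{pmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{pmatrix} = \frac{\partial x}{\partial u} \frac{\partial y}{\partial v} - \frac{\partial x}{\partial v} \frac{\partial y}{\partial u}, \]

is called the Jacobian of the transformation, which will be denoted by $\frac{\partial(x, y)}{\partial(u, v)}$, i.e.

\[ \frac{\partial(x, y)}{\partial(u, v)} = \begin{vmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{vmatrix}, \]

which is the density of area elements in the new coordinate system $(u, v)$ in the following sense.
Suppose the transformation $u = u(x, y)$ and $v = v(x, y)$ sends a domain $U$ in the $xy$-plane one to
one and onto a domain $D$ in the $uv$-plane; then

\[ \int_U f(x, y) \, dx \, dy = \int_D f(x(u, v), y(u, v)) \left| \frac{\partial(x, y)}{\partial(u, v)} \right| du \, dv. \]

That is to say, under the transformation $(x, y) \to (u, v)$, the area element $dx \, dy$ in the $xy$-plane is
equivalent to $\left| \frac{\partial(x, y)}{\partial(u, v)} \right| du \, dv$, where $du \, dv$ is the area element in the $uv$-plane.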



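Example 6.13 (Parabolic coordinate system) The coordinates $(u, v)$ given by the following relations

\[ x = \frac{1}{2}(u^2 - v^2), \qquad y = uv \]

are called the parabolic coordinates in the plane. The Jacobian matrix and Jacobian are given by

\[ \begin{pmatrix} \frac{\partial x}{\partial u} & \frac{\partial x}{\partial v} \\ \frac{\partial y}{\partial u} & \frac{\partial y}{\partial v} \end{pmatrix} = \begin{pmatrix} u & -v \\ v & u \end{pmatrix}, \qquad \frac{\partial(x, y)}{\partial(u, v)} = u^2 + v^2. \]

The transformation $(x, y) \to (u, v)$ is conformal in the sense that

\[ (dx)^2 + (dy)^2 = (u \, du - v \, dv)^2 + (v \, du + u \, dv)^2 = (u^2 + v^2)\left( (du)^2 + (dv)^2 \right) \]

and

\[ \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} = \frac{1}{u^2 + v^2} \left( \frac{\partial^2}{\partial u^2} + \frac{\partial^2}{\partial v^2} \right). \]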
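
The Jacobian of the parabolic coordinates in Example 6.13 can be approximated numerically. In the Python sketch below, the helper `jacobian_det`, the step size `h` and the sample point are our own choices for illustration; the partial derivatives are replaced by central differences.

```python
def transform(u, v):
    # Parabolic coordinates: returns (x, y).
    return 0.5 * (u * u - v * v), u * v

def jacobian_det(T, u, v, h=1e-6):
    """Approximate d(x, y)/d(u, v) for the map T at the point (u, v)."""
    # col_u = (dx/du, dy/du), col_v = (dx/dv, dy/dv)
    col_u = [(a - b) / (2 * h) for a, b in zip(T(u + h, v), T(u - h, v))]
    col_v = [(a - b) / (2 * h) for a, b in zip(T(u, v + h), T(u, v - h))]
    return col_u[0] * col_v[1] - col_v[0] * col_u[1]
```

At $(u, v) = (1.5, -0.5)$ the value should be close to $u^2 + v^2 = 2.5$.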
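
6.5.1 Polar coordinate system

If $P = (x, y) \in \mathbb{R}^2$ and $(x, y) \neq (0, 0)$, then we can determine the position of $(x, y)$ by its distance
$r$ to $(0, 0)$ and the angle $\theta$ from the $x$-axis to $\overrightarrow{OP}$, so that $x = r \cos \theta$ and $y = r \sin \theta$. Then the Jacobian
matrix is

\[ \begin{pmatrix} \frac{\partial x}{\partial r} & \frac{\partial x}{\partial \theta} \\ \frac{\partial y}{\partial r} & \frac{\partial y}{\partial \theta} \end{pmatrix} = \begin{pmatrix} \cos \theta & -r \sin \theta \\ \sin \theta & r \cos \theta \end{pmatrix}, \]

so that its Jacobian is $\frac{\partial(x, y)}{\partial(r, \theta)} = r$.

On the other hand, $r = \sqrt{x^2 + y^2}$ and $\tan \theta = \frac{y}{x}$. The Jacobian matrix of this transformation
is given by

\[ \begin{pmatrix} \frac{\partial r}{\partial x} & \frac{\partial r}{\partial y} \\ \frac{\partial \theta}{\partial x} & \frac{\partial \theta}{\partial y} \end{pmatrix} = \begin{pmatrix} \frac{x}{\sqrt{x^2 + y^2}} & \frac{y}{\sqrt{x^2 + y^2}} \\ -\frac{y}{x^2 + y^2} & \frac{x}{x^2 + y^2} \end{pmatrix}, \]

so that its Jacobian is $\frac{\partial(r, \theta)}{\partial(x, y)} = \frac{1}{\sqrt{x^2 + y^2}} = \frac{1}{r}$. Hence

\[ \frac{\partial(x, y)}{\partial(r, \theta)} \cdot \frac{\partial(r, \theta)}{\partial(x, y)} = 1. \]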

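If $f(x, y)$ is a function with continuous partial derivatives, and $F(r, \theta) = f(r \cos \theta, r \sin \theta)$, then
(writing $\frac{\partial f}{\partial r}$ and $\frac{\partial f}{\partial \theta}$ for the partial derivatives of $F$)

\[ \frac{\partial f}{\partial r} = \cos \theta \frac{\partial f}{\partial x} + \sin \theta \frac{\partial f}{\partial y}, \qquad
\frac{\partial f}{\partial \theta} = -r \sin \theta \frac{\partial f}{\partial x} + r \cos \theta \frac{\partial f}{\partial y}. \tag{6.14} \]

It is also useful to express $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ in terms of $\frac{\partial f}{\partial r}$ and $\frac{\partial f}{\partial \theta}$, which can be achieved by solving for
$\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$ from the above linear system:

\[
\begin{aligned}
\left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right)
&= \left( \frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta} \right) \begin{pmatrix} \cos \theta & -r \sin \theta \\ \sin \theta & r \cos \theta \end{pmatrix}^{-1}
= \frac{1}{r} \left( \frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta} \right) \begin{pmatrix} r \cos \theta & r \sin \theta \\ -\sin \theta & \cos \theta \end{pmatrix} \\
&= \left( \cos \theta \frac{\partial f}{\partial r} - \frac{\sin \theta}{r} \frac{\partial f}{\partial \theta}, \; \sin \theta \frac{\partial f}{\partial r} + \frac{\cos \theta}{r} \frac{\partial f}{\partial \theta} \right),
\end{aligned}
\]

that is

\[ \frac{\partial f}{\partial x} = \cos \theta \frac{\partial f}{\partial r} - \frac{\sin \theta}{r} \frac{\partial f}{\partial \theta}, \qquad
\frac{\partial f}{\partial y} = \sin \theta \frac{\partial f}{\partial r} + \frac{\cos \theta}{r} \frac{\partial f}{\partial \theta}. \tag{6.15} \]

It follows directly from (6.15) that

\[ \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2 = \left( \frac{\partial f}{\partial r} \right)^2 + \frac{1}{r^2} \left( \frac{\partial f}{\partial \theta} \right)^2. \tag{6.16} \]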
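
If $f(x, y)$ is a function with continuous partial derivatives up to order 2, then

\[ \Delta f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}. \]

We wish to work out the Laplace operator in the polar coordinate system. To this end, we continue
to compute the second order partial derivatives. In fact,

\[
\begin{aligned}
\frac{\partial^2 f}{\partial r^2} &= \frac{\partial}{\partial r} \left( \cos \theta \frac{\partial f}{\partial x} + \sin \theta \frac{\partial f}{\partial y} \right)
= \cos \theta \frac{\partial}{\partial r} \frac{\partial f}{\partial x} + \sin \theta \frac{\partial}{\partial r} \frac{\partial f}{\partial y} \\
&= \cos \theta \left( \cos \theta \frac{\partial^2 f}{\partial x^2} + \sin \theta \frac{\partial^2 f}{\partial y \partial x} \right)
+ \sin \theta \left( \cos \theta \frac{\partial^2 f}{\partial x \partial y} + \sin \theta \frac{\partial^2 f}{\partial y^2} \right) \\
&= \cos^2 \theta \frac{\partial^2 f}{\partial x^2} + 2 \sin \theta \cos \theta \frac{\partial^2 f}{\partial x \partial y} + \sin^2 \theta \frac{\partial^2 f}{\partial y^2},
\end{aligned}
\]

and

\[
\begin{aligned}
\frac{\partial^2 f}{\partial \theta^2} &= \frac{\partial}{\partial \theta} \left( -r \sin \theta \frac{\partial f}{\partial x} + r \cos \theta \frac{\partial f}{\partial y} \right) \\
&= -r \cos \theta \frac{\partial f}{\partial x} - r \sin \theta \frac{\partial f}{\partial y} - r \sin \theta \frac{\partial}{\partial \theta} \frac{\partial f}{\partial x} + r \cos \theta \frac{\partial}{\partial \theta} \frac{\partial f}{\partial y} \\
&= -r \cos \theta \frac{\partial f}{\partial x} - r \sin \theta \frac{\partial f}{\partial y}
- r \sin \theta \left( -r \sin \theta \frac{\partial^2 f}{\partial x^2} + r \cos \theta \frac{\partial^2 f}{\partial x \partial y} \right)
+ r \cos \theta \left( -r \sin \theta \frac{\partial^2 f}{\partial y \partial x} + r \cos \theta \frac{\partial^2 f}{\partial y^2} \right) \\
&= r^2 \sin^2 \theta \frac{\partial^2 f}{\partial x^2} - 2 r^2 \sin \theta \cos \theta \frac{\partial^2 f}{\partial x \partial y} + r^2 \cos^2 \theta \frac{\partial^2 f}{\partial y^2}
- r \cos \theta \frac{\partial f}{\partial x} - r \sin \theta \frac{\partial f}{\partial y}.
\end{aligned}
\]

Hence

\[
\frac{\partial^2 f}{\partial r^2} + \frac{1}{r^2} \frac{\partial^2 f}{\partial \theta^2}
= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} - \frac{\cos \theta}{r} \frac{\partial f}{\partial x} - \frac{\sin \theta}{r} \frac{\partial f}{\partial y}
= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} - \frac{1}{r} \frac{\partial f}{\partial r},
\]

in other words

\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r^2} \frac{\partial^2 f}{\partial \theta^2} + \frac{1}{r} \frac{\partial f}{\partial r}. \tag{6.17} \]

Therefore, under the polar coordinate system $(r, \theta)$ the Laplace operator is

\[ \Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r^2} \frac{\partial^2}{\partial \theta^2} + \frac{1}{r} \frac{\partial}{\partial r}. \tag{6.18} \]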
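
The polar form of the Laplace operator can be checked on a simple function. The Python sketch below is an illustration under our own assumptions: the test function $f(x, y) = x^2 y$ (whose Cartesian Laplacian is $2y$), the helper `polar_laplacian` and the step size `h` are not taken from the notes.

```python
import math

def f(x, y):
    return x * x * y                   # Cartesian Laplacian is 2*y

def F(r, th):
    # The same function expressed in polar coordinates.
    return f(r * math.cos(th), r * math.sin(th))

def polar_laplacian(F, r, th, h=1e-4):
    """Approximate F_rr + F_thth / r^2 + F_r / r by central differences."""
    F_r = (F(r + h, th) - F(r - h, th)) / (2 * h)
    F_rr = (F(r + h, th) - 2 * F(r, th) + F(r - h, th)) / h**2
    F_tt = (F(r, th + h) - 2 * F(r, th) + F(r, th - h)) / h**2
    return F_rr + F_tt / r**2 + F_r / r
```

At $(r, \theta) = (1.3, 0.4)$ the value should be close to $2y = 2 r \sin \theta$.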



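6.5.2 Cylindrical coordinate system in $\mathbb{R}^3$

Let $(x, y, z)$ be the standard coordinate system. In the cylindrical polar coordinates we keep the
$z$-coordinate and use the polar coordinates for $(x, y)$, that is

\[ x = r \cos \theta, \qquad y = r \sin \theta, \qquad z = z, \]

where $0 \le \theta < 2\pi$. The Jacobian matrix is given by

\[ \begin{pmatrix} \cos \theta & -r \sin \theta & 0 \\ \sin \theta & r \cos \theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

and the Jacobian is $\frac{\partial(x, y, z)}{\partial(r, \theta, z)} = r$. The inverse transformation is given by

\[ r = \sqrt{x^2 + y^2}, \qquad \tan \theta = \frac{y}{x}, \qquad z = z. \]

The Laplace operator in the cylindrical coordinates is

\[ \Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r^2} \frac{\partial^2}{\partial \theta^2} + \frac{1}{r} \frac{\partial}{\partial r} + \frac{\partial^2}{\partial z^2}. \tag{6.19} \]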



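6.5.3 Spherical coordinates in $\mathbb{R}^3$

Let $(x, y, z)$ be the Cartesian coordinates of a general point $P \in \mathbb{R}^3$, $P \neq (0, 0, 0)$. Let $\rho$ be
the distance between $P$ and $O$: $\rho = \sqrt{x^2 + y^2 + z^2}$, and let $\varphi$ be the angle from the $z$-axis to the
position vector $\overrightarrow{OP}$, so that $z = \rho \cos \varphi$ where $0 \le \varphi \le \pi$. Change $(x, y)$ to its polar coordinates
$(r \cos \theta, r \sin \theta)$, where $r$ is the distance from $O$ to the perpendicular projection of $P$ to the $xy$-plane,
so that $r = \rho \sin \varphi$. In terms of the spherical coordinates $(\rho, \varphi, \theta)$ we have

\[
\begin{aligned}
x &= \rho \sin \varphi \cos \theta, \\
y &= \rho \sin \varphi \sin \theta, \\
z &= \rho \cos \varphi,
\end{aligned}
\tag{6.20}
\]

where $\rho > 0$, $0 \le \varphi \le \pi$ and $0 \le \theta < 2\pi$. The Jacobian matrix of the transformation (6.20) can be
computed directly:

\[ \begin{pmatrix} \frac{\partial x}{\partial \rho} & \frac{\partial x}{\partial \varphi} & \frac{\partial x}{\partial \theta} \\ \frac{\partial y}{\partial \rho} & \frac{\partial y}{\partial \varphi} & \frac{\partial y}{\partial \theta} \\ \frac{\partial z}{\partial \rho} & \frac{\partial z}{\partial \varphi} & \frac{\partial z}{\partial \theta} \end{pmatrix}
= \begin{pmatrix} \sin \varphi \cos \theta & \rho \cos \varphi \cos \theta & -\rho \sin \varphi \sin \theta \\ \sin \varphi \sin \theta & \rho \cos \varphi \sin \theta & \rho \sin \varphi \cos \theta \\ \cos \varphi & -\rho \sin \varphi & 0 \end{pmatrix}. \]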
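
Hence the Jacobian is given by

\[
\begin{aligned}
\frac{\partial(x, y, z)}{\partial(\rho, \varphi, \theta)}
&= \begin{vmatrix} \sin \varphi \cos \theta & \rho \cos \varphi \cos \theta & -\rho \sin \varphi \sin \theta \\ \sin \varphi \sin \theta & \rho \cos \varphi \sin \theta & \rho \sin \varphi \cos \theta \\ \cos \varphi & -\rho \sin \varphi & 0 \end{vmatrix}
= \rho^2 \sin \varphi \begin{vmatrix} \sin \varphi \cos \theta & \cos \varphi \cos \theta & -\sin \theta \\ \sin \varphi \sin \theta & \cos \varphi \sin \theta & \cos \theta \\ \cos \varphi & -\sin \varphi & 0 \end{vmatrix} \\
&= \rho^2 \sin \varphi \left( \cos \varphi \begin{vmatrix} \cos \varphi \cos \theta & -\sin \theta \\ \cos \varphi \sin \theta & \cos \theta \end{vmatrix}
+ \sin \varphi \begin{vmatrix} \sin \varphi \cos \theta & -\sin \theta \\ \sin \varphi \sin \theta & \cos \theta \end{vmatrix} \right) \\
&= \rho^2 \sin \varphi \left( \cos^2 \varphi \begin{vmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{vmatrix}
+ \sin^2 \varphi \begin{vmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{vmatrix} \right) \\
&= \rho^2 \sin \varphi.
\end{aligned}
\]

The inverse transformation can be worked out as follows:

\[ \rho = \sqrt{x^2 + y^2 + z^2}, \qquad \tan \varphi = \frac{\sqrt{x^2 + y^2}}{z}, \qquad \tan \theta = \frac{y}{x}. \]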

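Finally let us consider the Laplace operator in $\mathbb{R}^3$,

\[ \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}, \]

and we wish to write the Laplace operator in the spherical coordinate system. Suppose $f(x, y, z)$
has continuous derivatives up to second order. First, we use the cylindrical coordinates $x = r \cos \theta$,
$y = r \sin \theta$, $z = z$. Then, according to (6.17),

\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r} \frac{\partial f}{\partial r} + \frac{1}{r^2} \frac{\partial^2 f}{\partial \theta^2}. \tag{6.21} \]

Next, we use the change of variables $z = \rho \cos \varphi$ and $r = \rho \sin \varphi$. Notice that $(\rho, \varphi)$ are the
polar coordinates for $(z, r)$, thus, according to (6.14),

\[ \frac{\partial f}{\partial \rho} = \cos \varphi \frac{\partial f}{\partial z} + \sin \varphi \frac{\partial f}{\partial r}, \qquad
\frac{\partial f}{\partial \varphi} = -\rho \sin \varphi \frac{\partial f}{\partial z} + \rho \cos \varphi \frac{\partial f}{\partial r}, \tag{6.22} \]

and, according to (6.17),

\[ \frac{\partial^2 f}{\partial z^2} + \frac{\partial^2 f}{\partial r^2} = \frac{\partial^2 f}{\partial \rho^2} + \frac{1}{\rho} \frac{\partial f}{\partial \rho} + \frac{1}{\rho^2} \frac{\partial^2 f}{\partial \varphi^2}. \tag{6.23} \]

Putting (6.21) and (6.23) together we obtain

\[
\begin{aligned}
\Delta f &= \frac{\partial^2 f}{\partial r^2} + \frac{1}{r} \frac{\partial f}{\partial r} + \frac{1}{r^2} \frac{\partial^2 f}{\partial \theta^2} + \frac{\partial^2 f}{\partial z^2} \\
&= \frac{\partial^2 f}{\partial \rho^2} + \frac{1}{\rho} \frac{\partial f}{\partial \rho} + \frac{1}{\rho^2} \frac{\partial^2 f}{\partial \varphi^2} + \frac{1}{r} \frac{\partial f}{\partial r} + \frac{1}{r^2} \frac{\partial^2 f}{\partial \theta^2}.
\end{aligned}
\tag{6.24}
\]

On the other hand, by solving for $\frac{\partial f}{\partial r}$ from (6.22) we have

\[ \frac{\partial f}{\partial r} = \sin \varphi \frac{\partial f}{\partial \rho} + \frac{\cos \varphi}{\rho} \frac{\partial f}{\partial \varphi}, \]

and substituting it (together with $r = \rho \sin \varphi$) into (6.24) we finally obtain

\[
\begin{aligned}
\Delta f &= \frac{\partial^2 f}{\partial \rho^2} + \frac{1}{\rho} \frac{\partial f}{\partial \rho} + \frac{1}{\rho^2} \frac{\partial^2 f}{\partial \varphi^2}
+ \frac{1}{\rho \sin \varphi} \left( \sin \varphi \frac{\partial f}{\partial \rho} + \frac{\cos \varphi}{\rho} \frac{\partial f}{\partial \varphi} \right)
+ \frac{1}{\rho^2 \sin^2 \varphi} \frac{\partial^2 f}{\partial \theta^2} \\
&= \frac{\partial^2 f}{\partial \rho^2} + \frac{1}{\rho^2} \frac{\partial^2 f}{\partial \varphi^2} + \frac{1}{\rho^2 \sin^2 \varphi} \frac{\partial^2 f}{\partial \theta^2}
+ \frac{2}{\rho} \frac{\partial f}{\partial \rho} + \frac{\cot \varphi}{\rho^2} \frac{\partial f}{\partial \varphi}.
\end{aligned}
\]

That is, under the spherical coordinate system the Laplace operator in $\mathbb{R}^3$ can be written as

\[ \Delta = \frac{\partial^2}{\partial \rho^2} + \frac{1}{\rho^2} \frac{\partial^2}{\partial \varphi^2} + \frac{1}{\rho^2 \sin^2 \varphi} \frac{\partial^2}{\partial \theta^2} + \frac{2}{\rho} \frac{\partial}{\partial \rho} + \frac{\cot \varphi}{\rho^2} \frac{\partial}{\partial \varphi}. \]
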
6.6 Some simple partial differential equations 

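An equation involving several variables, functions and their partial derivatives is called a partial
differential equation (abbreviated as PDE, or PDEs in the plural). Some important examples of
PDEs are the following.

• Laplace's equation (in steady-state temperature, electrostatic potential, fluid flow etc.)

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0. \]

In the cylindrical coordinates $r, \theta, z$ given by $x = r \cos \theta$, $y = r \sin \theta$, $z = z$, the equation is

\[ \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2} + \frac{\partial^2 u}{\partial z^2} = 0. \]

If we use the spherical coordinates $\rho, \varphi, \theta$ defined by $x = \rho \sin \varphi \cos \theta$, $y = \rho \sin \varphi \sin \theta$ and
$z = \rho \cos \varphi$, then Laplace's equation becomes

\[ \frac{\partial^2 u}{\partial \rho^2} + \frac{2}{\rho} \frac{\partial u}{\partial \rho} + \frac{1}{\rho^2} \frac{\partial^2 u}{\partial \varphi^2} + \frac{\cot \varphi}{\rho^2} \frac{\partial u}{\partial \varphi} + \frac{\csc^2 \varphi}{\rho^2} \frac{\partial^2 u}{\partial \theta^2} = 0. \]

• Heat equation (modelling the distribution of temperature, the diffusion of heat)

\[ \frac{\partial u}{\partial t} = \alpha^2 \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} \right), \]

in which $\alpha^2$ is a constant called the thermal diffusivity.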
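
• Schrödinger's wave equation in quantum physics

\[ -\frac{\hbar^2}{2m} \Delta \psi + V(x) \psi = E \psi, \]

where $\hbar$ is the (reduced) Planck constant, $m$ is the mass, $V$ is the potential energy and $E$ is the energy.
$\psi$ represents the wave function. For further reading, see for example, A. P. French and E. F.
Taylor: An Introduction to Quantum Physics.

• Black-Scholes partial differential equation in finance

\[ \frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + r S \frac{\partial V}{\partial S} - r V = 0, \]

where $S$ represents the current value of an asset, $r$ is the interest rate for a riskless account, and
$\sigma$ represents the volatility. For further reading, see for example, P. Wilmott, S. Howison and
J. Dewynne: The mathematics of financial derivatives - a student introduction.

• Maxwell's equations for electromagnetic fields

\[ \nabla \cdot \mathbf{E} = 0; \qquad \nabla \cdot \mathbf{B} = 0; \]

\[ \frac{\partial}{\partial t} \mathbf{B} = -\nabla \times \mathbf{E}; \qquad \frac{1}{c^2} \frac{\partial}{\partial t} \mathbf{E} = \nabla \times \mathbf{B} - \mu_0 \mathbf{J}, \]

where $\mathbf{E}$ is the electric field strength, $\mathbf{B}$ the magnetic field strength, $\mathbf{J}$ the volume current
density, $c$ is the speed of light, and $\mu_0$ the permeability of vacuum. For more details, see for
example, P. Lorrain, D. Corson and F. Lorrain: Electromagnetic fields and waves.

• Navier-Stokes equations for incompressible flow of fluid, which is a system of PDEs for the
velocity vector field $\mathbf{u} = (u^1, u^2, u^3)$ and the pressure $p$:

\[ \frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u} = \nu \Delta \mathbf{u} - \nabla p, \qquad \nabla \cdot \mathbf{u} = 0, \]

where the contraction $\mathbf{u} \cdot \nabla \mathbf{u}$ is the directional derivative of the velocity $\mathbf{u}$ in the direction
$\mathbf{u}$. That is

\[ \mathbf{u} \cdot \nabla \mathbf{u} = \left( \mathbf{u} \cdot \nabla u^1, \; \mathbf{u} \cdot \nabla u^2, \; \mathbf{u} \cdot \nabla u^3 \right) \]

where

\[ \mathbf{u} \cdot \nabla u^i = u^1 \frac{\partial u^i}{\partial x} + u^2 \frac{\partial u^i}{\partial y} + u^3 \frac{\partial u^i}{\partial z} \]

for $i = 1, 2, 3$. In terms of the components $u^1, u^2, u^3$ of the velocity $\mathbf{u}$, the Navier-Stokes
equations are

\[
\begin{aligned}
\frac{\partial u^1}{\partial t} + u^1 \frac{\partial u^1}{\partial x} + u^2 \frac{\partial u^1}{\partial y} + u^3 \frac{\partial u^1}{\partial z} &= \nu \Delta u^1 - \frac{\partial p}{\partial x}, \\
\frac{\partial u^2}{\partial t} + u^1 \frac{\partial u^2}{\partial x} + u^2 \frac{\partial u^2}{\partial y} + u^3 \frac{\partial u^2}{\partial z} &= \nu \Delta u^2 - \frac{\partial p}{\partial y}, \\
\frac{\partial u^3}{\partial t} + u^1 \frac{\partial u^3}{\partial x} + u^2 \frac{\partial u^3}{\partial y} + u^3 \frac{\partial u^3}{\partial z} &= \nu \Delta u^3 - \frac{\partial p}{\partial z}, \\
\frac{\partial u^1}{\partial x} + \frac{\partial u^2}{\partial y} + \frac{\partial u^3}{\partial z} &= 0.
\end{aligned}
\]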

Of course there are many more PDEs which appear in science. In this course we will study
none of them, and instead we choose to study some very simple PDEs as examples for which we
can even find closed forms (i.e. explicit formulae) for their solutions.



6.6.1 First order linear PDE (in two variables) 

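Let us consider the following type of partial differential equations (in two variables)

\[ P(x, y) \frac{\partial z}{\partial x} + Q(x, y) \frac{\partial z}{\partial y} = 0, \tag{6.27} \]

where at least one of $P$ and $Q$ does not vanish. We wish to find some explicit solutions to (6.27).
The idea is to consider the first order ODE

\[ \frac{dx}{P(x, y)} = \frac{dy}{Q(x, y)}. \tag{6.28} \]

That is,

\[ \frac{dy}{dx} = \frac{Q(x, y)}{P(x, y)}. \]

Suppose the general solution to (6.28) is given implicitly by $\psi(x, y) = C$ where $C$ is a constant;
then the general solution to the PDE (6.27) is given by

\[ z = \Phi(\psi(x, y)) \]

where $\Phi$ is an arbitrary differentiable function. In fact, if $\psi(x, y) = C$ is the solution to (6.28), then,
by differentiating in $x$, we obtain

\[ \frac{\partial \psi}{\partial x} + \frac{\partial \psi}{\partial y} \frac{dy}{dx} = 0. \]

Substituting $\frac{dy}{dx} = \frac{Q}{P}$ into this equation we obtain

\[ \frac{\partial \psi}{\partial x} + \frac{Q}{P} \frac{\partial \psi}{\partial y} = 0, \quad\text{i.e.}\quad P \frac{\partial \psi}{\partial x} + Q \frac{\partial \psi}{\partial y} = 0. \]

On the other hand, by the chain rule

\[ \frac{\partial z}{\partial x} = \Phi'(\psi) \frac{\partial \psi}{\partial x}, \qquad \frac{\partial z}{\partial y} = \Phi'(\psi) \frac{\partial \psi}{\partial y}, \]

so that

\[ P(x, y) \frac{\partial z}{\partial x} + Q(x, y) \frac{\partial z}{\partial y} = \Phi'(\psi) \left( P \frac{\partial \psi}{\partial x} + Q \frac{\partial \psi}{\partial y} \right) = 0, \]

which shows that $z = \Phi(\psi(x, y))$ is a solution to the PDE (6.27).
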
Example 6.14 Find the solution to the following PDE

$$x\frac{\partial z}{\partial y} - y\frac{\partial z}{\partial x} = 0$$

such that when $x = 0$ then $z = y^2$.

First solve the first order ODE

$$\frac{dx}{-y} = \frac{dy}{x}$$

which is separable, and has the general solution given by $x^2 + y^2 = C$. Thus the solution to the
PDE is $z = \Phi(x^2 + y^2)$, where $\Phi$ is differentiable. We want a solution such that when $x = 0$, $z = y^2$,
so that $\Phi(y^2) = y^2$ for any $y$, hence $\Phi(t) = t$ for $t \geq 0$, which yields that $z = x^2 + y^2$.
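
The result can be sanity-checked numerically. The following is a minimal sketch, assuming the PDE of Example 6.14 reads $x\,\partial z/\partial y - y\,\partial z/\partial x = 0$ (the form consistent with the auxiliary ODE $dx/(-y) = dy/x$); the helper name `pde_residual`, the step size and the sample points are illustrative choices, not part of the notes.

```python
def z(x, y):
    """Candidate solution z = x^2 + y^2 from Example 6.14."""
    return x * x + y * y


def pde_residual(x, y, h=1e-5):
    """Central-difference estimate of x*z_y - y*z_x at the point (x, y)."""
    z_x = (z(x + h, y) - z(x - h, y)) / (2 * h)
    z_y = (z(x, y + h) - z(x, y - h)) / (2 * h)
    return x * z_y - y * z_x


# Arbitrary sample points (hypothetical choices); the residual should vanish.
points = [(0.5, -1.2), (2.0, 3.0), (-1.5, 0.7)]
residuals = [pde_residual(x, y) for x, y in points]
```

The residuals vanish up to rounding error, and `z(0.0, y)` returns $y^2$, matching the side condition.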

6.6.2 First order quasi linear PDEs (in two variables)

Similarly, in order to solve the following type of first order quasi linear PDE

$$P(x,y,z)\frac{\partial z}{\partial x} + Q(x,y,z)\frac{\partial z}{\partial y} = R(x,y,z) \tag{6.29}$$

we may attempt to solve the following ODEs

$$\frac{dx}{P(x,y,z)} = \frac{dy}{Q(x,y,z)} = \frac{dz}{R(x,y,z)}$$

which contain three ODEs, but in general only two of them are independent. Choose a pair of
them, solve them and obtain two solutions which may be given implicitly by

$$\psi_1(x,y,z) = C_1 \quad\text{and}\quad \psi_2(x,y,z) = C_2.$$

Then the general solution to the PDE (6.29) is given implicitly by

$$\Phi(\psi_1(x,y,z), \psi_2(x,y,z)) = 0$$

where $\Phi$ is an arbitrary function with continuous partial derivatives.

Example 6.15 Find the solution to the following PDE

$$x\frac{\partial z}{\partial x} + (y + x^2)\frac{\partial z}{\partial y} = z$$

which satisfies the condition that when $x = 2$ then $z = y - 4$.

Set up the auxiliary ODEs

$$\frac{dx}{x} = \frac{dy}{y + x^2} = \frac{dz}{z},$$

from which we choose two, for example

$$\frac{dx}{x} = \frac{dy}{y + x^2}, \qquad \frac{dx}{x} = \frac{dz}{z}.$$

The first equation may be written as

$$\frac{dy}{dx} - \frac{1}{x}y = x$$

which is a first order linear equation. $\frac{1}{x}$ is an integrating factor; multiplying by $\frac{1}{x}$ we obtain

$$\frac{1}{x}\frac{dy}{dx} - \frac{1}{x^2}y = \frac{d}{dx}\left(\frac{y}{x}\right) = 1,$$

and integrating the equation we obtain $\frac{y}{x} = x + C_1$, i.e.

$$\frac{y - x^2}{x} = C_1$$

is the general solution to the first ODE. The second ODE has the general solution

$$\frac{z}{x} = C_2.$$

Therefore the general solution to the PDE is

$$\Phi\left(\frac{y - x^2}{x}, \frac{z}{x}\right) = 0,$$

or, by solving for $\frac{z}{x}$, the solution may be written as

$$z = x f\left(\frac{y - x^2}{x}\right)$$

where $f$ is an arbitrary differentiable function. If $x = 2$ then $z = y - 4$, so that

$$y - 4 = 2f\left(\frac{y - 4}{2}\right).$$

Thus $t = 2f(t/2)$ (after the substitution $t = y - 4$), so that $f(t) = t$. Hence the solution to the initial
problem is given by $z = y - x^2$.
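
As a quick check of this answer, here is a minimal sketch that evaluates the PDE residual by central differences, assuming the equation of Example 6.15 is $x z_x + (y + x^2) z_y = z$ as suggested by the auxiliary ODEs. The function name `residual` and the sample points are hypothetical.

```python
def z(x, y):
    """Candidate solution z = y - x^2 from Example 6.15."""
    return y - x * x


def residual(x, y, h=1e-5):
    """Central-difference estimate of x*z_x + (y + x^2)*z_y - z at (x, y)."""
    z_x = (z(x + h, y) - z(x - h, y)) / (2 * h)
    z_y = (z(x, y + h) - z(x, y - h)) / (2 * h)
    return x * z_x + (y + x * x) * z_y - z(x, y)


# The side condition: on the line x = 2 the solution should equal y - 4.
value_on_line = z(2.0, 7.0)
```

Both the PDE and the condition $z = y - 4$ on $x = 2$ are satisfied by $z = y - x^2$.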

6.6.3 Method of separation of variables 

Some PDEs (in two variables) have special type of solutions which are separable, i.e. in a product 
form u{x,t) = g{x)h{t), for which we may attempt to make substitution: u{x,t) = g{x)h{t) to 
reduce the PDE to ordinary differential equations for g and h. 

Example 6.16 (The heat equation) Consider the one-dimensional heat equation:

$$\frac{\partial u(x,t)}{\partial t} = \frac{\sigma^2}{2}\frac{\partial^2 u(x,t)}{\partial x^2}$$

where $\sigma > 0$ is a constant.

By inspection, we can see that the Gaussian probability function

$$u(x,t) = \frac{1}{\sqrt{2\pi\sigma^2 t}}\, e^{-\frac{x^2}{2\sigma^2 t}} \quad\text{for } t > 0$$

is a positive solution to the heat equation for $t > 0$.

Let us search for solutions $u(x,t)$ which are separable. To this end, make the substitution $u(x,t) =
g(x)h(t)$. Since

$$\frac{\partial u(x,t)}{\partial t} = g(x)h'(t), \quad\text{and}\quad \frac{\partial^2 u(x,t)}{\partial x^2} = g''(x)h(t),$$

the heat equation becomes

$$g(x)h'(t) = \frac{\sigma^2}{2}g''(x)h(t).$$

Separate the variables to obtain

$$\frac{h'(t)}{h(t)} = \frac{\sigma^2}{2}\frac{g''(x)}{g(x)}.$$

Since $\frac{h'(t)}{h(t)}$ only depends on $t$, while the equation implies that it only depends on $x$, it
must be a constant function. Similarly, $\frac{\sigma^2}{2}\frac{g''(x)}{g(x)}$ is a constant independent of $x$ or $t$. Therefore we
must have

$$\frac{h'(t)}{h(t)} = \frac{\sigma^2}{2}\frac{g''(x)}{g(x)} = \lambda$$

where $\lambda$ is a constant. The heat equation is thus transformed to a pair of linear
ODEs

$$h'(t) = \lambda h(t), \qquad g''(x) = \frac{2\lambda}{\sigma^2}g(x).$$

The first ODE has the general solution $h(t) = C_1 e^{\lambda t}$, and the second ODE has the following general solution.

1. If $\lambda > 0$, then

$$g(x) = C_2 e^{\frac{\sqrt{2\lambda}}{\sigma}x} + C_3 e^{-\frac{\sqrt{2\lambda}}{\sigma}x}.$$

2. If $\lambda < 0$, then

$$g(x) = C_2\cos\left(\frac{\sqrt{-2\lambda}}{\sigma}x\right) + C_3\sin\left(\frac{\sqrt{-2\lambda}}{\sigma}x\right).$$

3. If $\lambda = 0$, then

$$g(x) = C_2 + C_3 x.$$
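
To illustrate, the sketch below builds one separable solution for a negative separation constant and checks the heat equation by finite differences. It assumes the equation is $u_t = \frac{\sigma^2}{2}u_{xx}$; the values of `SIGMA`, `LAM` and the constants multiplying cosine and sine are arbitrary illustrative choices.

```python
import math

SIGMA = 1.5   # assumed value of the constant sigma > 0
LAM = -0.8    # assumed negative separation constant lambda


def u(x, t):
    """Separable solution u = h(t) g(x), with h = e^{lambda t} and g trigonometric."""
    k = math.sqrt(-2.0 * LAM) / SIGMA
    return math.exp(LAM * t) * (math.cos(k * x) + 0.5 * math.sin(k * x))


def heat_residual(x, t, h=1e-4):
    """Finite-difference estimate of u_t - (sigma^2 / 2) u_xx at (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - 0.5 * SIGMA ** 2 * u_xx
```

The residual is zero up to discretisation and rounding error at any sample point, as expected for a product $g(x)h(t)$ built from the two ODEs.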



7 Gradient vectors, normal vectors to surfaces 

In this part we consider curves and surfaces in $\mathbb{R}^3$. For simplicity, let us declare that all functions
we will encounter in this part are defined on open subsets (unless otherwise specified), and have
continuous partial derivatives.

The graph of a function $y = f(x)$ defined on $(a, b)$ is a curve in the plane $\mathbb{R}^2$. The derivative
$f'(x_0)$ measures the slope of the line tangent to the graph at $(x_0, f(x_0))$: $\tan\alpha = f'(x_0)$ where $\alpha$
is the angle from the $x$-axis to the tangent line at $(x_0, f(x_0))$. The equation for the tangent line at
$(x_0, f(x_0))$ is a linear equation

$$y - f(x_0) = f'(x_0)(x - x_0).$$

The graph of $y = f(x)$ has a natural parameterization: we may write the coordinates $(x, y)$ on
the graph as

$$x = t, \quad y = f(t) \tag{7.1}$$

and consider $t \in (a, b)$ as a parameter. The mapping $t \to (t, f(t))$ is called a parameterized curve in
the plane. Similarly, the tangent line at $(x_0, f(x_0))$ has a natural parameterization, namely

$$x = t, \quad y = f(x_0) + f'(x_0)(t - x_0),$$

where $(x, y)$ is a general point lying on the tangent line. In vector notation, it can be
written as

$$(x - x_0, y - f(x_0)) = (1, f'(x_0))(t - x_0)$$

i.e. $(x - x_0, y - f(x_0))$ is parallel to the vector $(1, f'(x_0))$, which is called the tangent vector of the
parameterized curve. Note that the first coordinate 1 appears as $\frac{dx}{dt} = 1$, so the tangent vector at
$(x(t), y(t))$ can be written as $(x'(t), y'(t))$, where $x(t) = t$ and $y(t) = f(t)$ for the graph of $y = f(x)$.

Generalizing this notion gives the concept of parameterized curves in the plane $\mathbb{R}^2$. That is, a
parameterized curve $\gamma$ in the plane $\mathbb{R}^2$ is a mapping $t \to (x(t), y(t))$, i.e.

$$x = x(t), \quad y = y(t),$$

where $t \in (a, b)$ (some interval) is a parameter. Since $\frac{dy}{dx} = \frac{y'(t)}{x'(t)}$, the tangent line to the curve $\gamma$
at a point $(x(t_0), y(t_0))$ has a parameterization

$$\begin{cases} x = x(t_0) + x'(t_0)(t - t_0), \\ y = y(t_0) + y'(t_0)(t - t_0), \end{cases}$$

which represents the line passing through $(x(t_0), y(t_0))$ in the direction $(x'(t_0), y'(t_0))$.

A curve in $\mathbb{R}^2$ can also be described implicitly by an equation such as $F(x, y) = 0$. You should
be familiar with the standard quadratic curves such as circles, ellipses, parabolas and hyperbolas
(for a revision you may refer to Richard Earl's notes).

Example 7.1 Consider an ellipse defined implicitly by the equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$

which has a parameterization defined by

$$x = a\cos t, \quad y = b\sin t$$

where $0 \leq t < 2\pi$. A tangent vector at $(a\cos t, b\sin t)$ is thus given as $(-a\sin t, b\cos t)$.

A parameterized curve $\gamma$ in the space $\mathbb{R}^3$ may be described by a vector valued function of one
variable $t$, i.e. a mapping $t \to \gamma(t)$ where

$$\gamma(t) = (x(t), y(t), z(t)), \quad t \in (a, b). \tag{7.2}$$

The tangent vector to the curve at $\gamma(t_0)$ is the vector $(x'(t_0), y'(t_0), z'(t_0))$ and the line tangent to
the curve at $(x(t_0), y(t_0), z(t_0))$ is given by the equation

$$\mathbf{r} = \gamma(t_0) + \gamma'(t_0)(t - t_0),$$

that is

$$\begin{cases} x = x(t_0) + x'(t_0)(t - t_0), \\ y = y(t_0) + y'(t_0)(t - t_0), \\ z = z(t_0) + z'(t_0)(t - t_0). \end{cases}$$

7.1 Normal vectors, tangent planes

Let us consider smooth surfaces in $\mathbb{R}^3$. As for the case of curves, the graph of a function $z = f(x, y)$
on a domain $U$ is considered as a parameterized surface

$$(x, y) \to (x, y, f(x, y)), \quad (x, y) \in U. \tag{7.3}$$

By relabelling the variables, the graph of $z = f(x, y)$ is a parameterized surface defined by the mapping
$(u, v) \to (u, v, f(u, v))$ where $(u, v)$ are two parameters. In general, a mapping

$$(u, v) \to (x(u, v), y(u, v), z(u, v)) \tag{7.4}$$

where $(u, v)$ runs through an open subset $U \subset \mathbb{R}^2$ is called a parameterized surface in the space
$\mathbb{R}^3$. The mapping or the parameterized surface is often written as

$$x = x(u, v), \quad y = y(u, v), \quad z = z(u, v).$$

When $(u, v)$ runs through a subset $U$, the point $(x(u, v), y(u, v), z(u, v))$ draws out a surface in the space,
namely the image of $U$ under the mapping (7.4).

A surface $S$ may be described by an equation

$$F(x, y, z) = 0, \tag{7.5}$$

where, in order to avoid technical difficulty, we assume that $\nabla F \neq 0$. For example, a sphere
$x^2 + y^2 + z^2 = R^2$ has a parameterized representation in terms of spherical coordinates $(\varphi, \theta)$
[notice that the equation of the sphere in spherical coordinates takes a simple form: $\rho = R$]:

$$x = R\sin\varphi\cos\theta, \quad y = R\sin\varphi\sin\theta, \quad z = R\cos\varphi,$$

where $\varphi \in [0, \pi]$ and $\theta \in [0, 2\pi)$ are two parameters.

The graph of a function z = f{x, y) is a surface which is a parameterized surface, but also can 
be described by the equation 

z- f{x,y) = 0. 

Let us now define the concept of the tangent plane at a point on the surface. Let $P =
(x_0, y_0, z_0) \in S$, the surface defined by the equation (7.5), and let $\gamma(t) = (x(t), y(t), z(t))$ be any
parameterized curve on the surface $S$ passing through the point $P$, say $\gamma(0) = (x_0, y_0, z_0) = P$.
Then

$$F(x(t), y(t), z(t)) = 0$$

so, by differentiating in $t$ at $t = 0$, employing the chain rule, we obtain

$$F_x(x_0, y_0, z_0)x'(0) + F_y(x_0, y_0, z_0)y'(0) + F_z(x_0, y_0, z_0)z'(0) = 0. \tag{7.6}$$

Recall that $\nabla F = (F_x, F_y, F_z)$ is the gradient vector field of $F$, so we may rewrite (7.6) as

$$\nabla F(x_0, y_0, z_0)\cdot\gamma'(0) = 0 \tag{7.7}$$

which says the tangent vector $\gamma'(0)$ is perpendicular to the gradient of $F$. Since $\gamma$ can be any curve
on the surface $S$, $\gamma'(0)$ can be any vector tangent to the surface $S$ at $P$, so (7.7) means that
any vector tangent to the surface $S$ at $P$ is perpendicular to the gradient vector $\nabla F(x_0, y_0, z_0)$,
and therefore all tangent vectors to $S$ at the point $P$ lie on the plane passing through $P$ and
perpendicular to $\nabla F(x_0, y_0, z_0)$, which is called the tangent plane to $S$ at $P$. We therefore call
$\nabla F(x_0, y_0, z_0)$ a normal vector to the surface $S$ at $P$.

Suppose that $(x, y, z)$ belongs to the tangent plane at $P$, so that $(x - x_0, y - y_0, z - z_0)$ lies on
the tangent plane. Then it must be perpendicular to the normal vector $\nabla F(x_0, y_0, z_0)$, thus

$$F_x(x_0, y_0, z_0)(x - x_0) + F_y(x_0, y_0, z_0)(y - y_0) + F_z(x_0, y_0, z_0)(z - z_0) = 0 \tag{7.8}$$

which is the equation for the tangent plane to $S$ at $P = (x_0, y_0, z_0)$.

If the surface $S$ is the graph of a function $z = f(x, y)$, we may take $F(x, y, z) = z - f(x, y)$,
thus a normal vector at $(x, y, f(x, y))$ is the gradient vector of $F$, which is $(-f_x, -f_y, 1)$. Therefore
an equation for the tangent plane to the graph of $z = f(x, y)$ at $(x_0, y_0, z_0)$, where $z_0 = f(x_0, y_0)$, is given by

$$-f_x(x_0, y_0)(x - x_0) - f_y(x_0, y_0)(y - y_0) + z - z_0 = 0 \tag{7.9}$$

which we have already seen before.

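Consider on the other hand a parameterized surface $S$, which is described by a vector valued
function of two parameters $(u, v)$:

$$x = x(u, v), \quad y = y(u, v) \quad\text{and}\quad z = z(u, v) \tag{7.10}$$

where $(u, v)$ runs through an open subset $D \subset \mathbb{R}^2$. Let $(u_0, v_0) \in D$, so that

$$P = (x_0, y_0, z_0) = (x(u_0, v_0), y(u_0, v_0), z(u_0, v_0))$$

is a point on the surface $S$. Consider the parameterized curve $\gamma_1$ defined by

$$\gamma_1(u) = (x(u, v_0), y(u, v_0), z(u, v_0))$$

and the parameterized curve $\gamma_2$ defined by

$$\gamma_2(v) = (x(u_0, v), y(u_0, v), z(u_0, v))$$

where $u$ (resp. $v$) is considered as a parameter. Then both curves $\gamma_1$ and $\gamma_2$ lie on the surface
and pass through $P$, and $\gamma_1'(u_0)$ and $\gamma_2'(v_0)$ are two tangent vectors to the
parameterized surface $S$ at $P$. The cross product $\gamma_1'(u_0) \times \gamma_2'(v_0)$
is a vector perpendicular to both vectors $\gamma_1'(u_0)$ and $\gamma_2'(v_0)$ [Geometry I,
Prelims] and therefore $\gamma_1'(u_0) \times \gamma_2'(v_0)$ is a normal vector to the surface $S$ at $P$. On the other hand, by
definition of partial derivatives

$$\gamma_1'(u) = \frac{\partial\gamma}{\partial u} = \left(\frac{\partial x}{\partial u}, \frac{\partial y}{\partial u}, \frac{\partial z}{\partial u}\right) \quad\text{and}\quad \gamma_2'(v) = \frac{\partial\gamma}{\partial v} = \left(\frac{\partial x}{\partial v}, \frac{\partial y}{\partial v}, \frac{\partial z}{\partial v}\right)$$

where $\gamma(u, v) = (x(u, v), y(u, v), z(u, v))$. Thus $\frac{\partial\gamma}{\partial u} \times \frac{\partial\gamma}{\partial v}$ is a normal vector to the parameterized
surface $S$, which according to the definition of the cross product is given by

$$\frac{\partial\gamma}{\partial u} \times \frac{\partial\gamma}{\partial v} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial x}{\partial u} & \frac{\partial y}{\partial u} & \frac{\partial z}{\partial u} \\ \frac{\partial x}{\partial v} & \frac{\partial y}{\partial v} & \frac{\partial z}{\partial v} \end{vmatrix}.$$

The tangent plane to $S$ at $\gamma(u_0, v_0)$ has the equation

$$\left(\frac{\partial\gamma(u_0, v_0)}{\partial u} \times \frac{\partial\gamma(u_0, v_0)}{\partial v}\right)\cdot(\mathbf{r} - \gamma(u_0, v_0)) = 0 \tag{7.11}$$

where $\mathbf{r} = (x, y, z)$ is the position vector of a general point in the tangent plane. In terms of
the Cartesian coordinates, (7.11) can be written, by working out the dot product, as

$$\begin{vmatrix} x - x_0 & y - y_0 & z - z_0 \\ \frac{\partial x}{\partial u} & \frac{\partial y}{\partial u} & \frac{\partial z}{\partial u} \\ \frac{\partial x}{\partial v} & \frac{\partial y}{\partial v} & \frac{\partial z}{\partial v} \end{vmatrix} = 0 \tag{7.12}$$

where the partial derivatives are evaluated at $(u_0, v_0)$, and $(x_0, y_0, z_0) = \gamma(u_0, v_0)$.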
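Example 7.2 The sphere with radius $R > 0$ may be described implicitly by the equation

$$x^2 + y^2 + z^2 = R^2,$$

so, with $F(x, y, z) = x^2 + y^2 + z^2 - R^2$, a normal vector to the tangent plane at $(x_0, y_0, z_0)$ is $\nabla F(x_0, y_0, z_0) = 2(x_0, y_0, z_0)$, which has the
same direction as the coordinate vector, and the tangent plane has the equation

$$x_0(x - x_0) + y_0(y - y_0) + z_0(z - z_0) = 0.$$

Since the point $(x_0, y_0, z_0)$ lies on the sphere, the equation can be simplified to

$$x_0 x + y_0 y + z_0 z = R^2.$$

The sphere may also be parameterized via the spherical coordinates, which gives the parameterized
surface

$$x = R\sin\varphi\cos\theta, \quad y = R\sin\varphi\sin\theta, \quad z = R\cos\varphi$$

where $0 \leq \varphi \leq \pi$ and $0 \leq \theta < 2\pi$, hence the tangent plane at a point $(x_0, y_0, z_0)$ has the equation

$$\begin{vmatrix} x - x_0 & y - y_0 & z - z_0 \\ \frac{\partial x}{\partial\varphi} & \frac{\partial y}{\partial\varphi} & \frac{\partial z}{\partial\varphi} \\ \frac{\partial x}{\partial\theta} & \frac{\partial y}{\partial\theta} & \frac{\partial z}{\partial\theta} \end{vmatrix} = \begin{vmatrix} x - x_0 & y - y_0 & z - z_0 \\ R\cos\varphi\cos\theta & R\cos\varphi\sin\theta & -R\sin\varphi \\ -R\sin\varphi\sin\theta & R\sin\varphi\cos\theta & 0 \end{vmatrix}$$

$$= R^2\begin{vmatrix} x - x_0 & y - y_0 & z - z_0 \\ \cos\varphi\cos\theta & \cos\varphi\sin\theta & -\sin\varphi \\ -\sin\varphi\sin\theta & \sin\varphi\cos\theta & 0 \end{vmatrix} = 0$$

which, after expanding the determinant and dividing by $R^2\sin\varphi$ (for $\sin\varphi \neq 0$), may be simplified to

$$\sin\varphi\cos\theta\,(x - x_0) + \sin\varphi\sin\theta\,(y - y_0) + \cos\varphi\,(z - z_0) = 0.$$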
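
The two descriptions of the normal vector can be compared numerically. The sketch below is illustrative only: it computes the cross product of the two partial derivative vectors of the spherical parameterization and compares it with the position vector. The helper names `cross`, `sphere_point` and `sphere_normal` are hypothetical.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def sphere_point(R, phi, theta):
    """Point on the sphere of radius R with spherical angles (phi, theta)."""
    return (R * math.sin(phi) * math.cos(theta),
            R * math.sin(phi) * math.sin(theta),
            R * math.cos(phi))


def sphere_normal(R, phi, theta):
    """Normal vector obtained as (d gamma / d phi) x (d gamma / d theta)."""
    g_phi = (R * math.cos(phi) * math.cos(theta),
             R * math.cos(phi) * math.sin(theta),
             -R * math.sin(phi))
    g_theta = (-R * math.sin(phi) * math.sin(theta),
               R * math.sin(phi) * math.cos(theta),
               0.0)
    return cross(g_phi, g_theta)
```

For any angles, `sphere_normal` equals $R\sin\varphi$ times `sphere_point`, so the parametric normal is parallel to the gradient $2(x_0, y_0, z_0)$ whenever $\sin\varphi \neq 0$.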
7.2 Directional derivatives

Suppose that $F(x, y, z)$ has continuous partial derivatives; then its gradient vector by definition is
$\nabla F = (F_x, F_y, F_z)$. Suppose $\gamma(t) = (x(t), y(t), z(t))$ (where $t \in (a, b)$) is a parameterized curve with
continuous derivatives; then

$$f(t) = F\circ\gamma(t) = F(x(t), y(t), z(t))$$

is differentiable, and, by the chain rule,

$$\frac{df}{dt} = F_x x'(t) + F_y y'(t) + F_z z'(t) = \nabla F(\gamma(t))\cdot\gamma'(t).$$

In particular, if $\mathbf{v} = (v_1, v_2, v_3)$ is a non-zero vector, and we take

$$\gamma(t) = (x_0 + v_1 t, y_0 + v_2 t, z_0 + v_3 t),$$

the line passing through $(x_0, y_0, z_0)$, then the derivative

$$\frac{d}{dt}F\circ\gamma(0) = \nabla F\cdot\gamma'(0) = \nabla F(x_0, y_0, z_0)\cdot\mathbf{v} = v_1 F_x + v_2 F_y + v_3 F_z$$

is called the directional derivative of $F$ in $\mathbf{v}$, denoted by $D_{\mathbf{v}}F$, hence

$$D_{\mathbf{v}}F = \nabla F\cdot\mathbf{v}.$$

By definition

$$D_{\mathbf{v}}F(x_0, y_0, z_0) = \lim_{t\to 0}\frac{F(x_0 + v_1 t, y_0 + v_2 t, z_0 + v_3 t) - F(x_0, y_0, z_0)}{t}.$$

The previous discussion can be stated as the following

Proposition 7.3 Suppose that $F(x, y, z)$ is a function on an open subset $U \subset \mathbb{R}^3$ with continuous
partial derivatives, and $\gamma(t)$ is a parameterized curve in $U$ with tangent vector $\mathbf{v}$ at $\gamma(0)$, i.e.
$\gamma'(0) = \mathbf{v}$, then

$$\frac{d}{dt}F\circ\gamma(0) = D_{\mathbf{v}}F(\gamma(0)) = \nabla F(\gamma(0))\cdot\mathbf{v}. \tag{7.13}$$
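
The identity $D_{\mathbf{v}}F = \nabla F\cdot\mathbf{v}$ is easy to test on a concrete function. The following sketch uses the hypothetical example $F(x, y, z) = x^2 y + \sin z$, which is not taken from the notes, and compares a difference quotient along the line $\gamma(t)$ with the dot product.

```python
import math


def F(x, y, z):
    """Hypothetical test function, chosen only for illustration."""
    return x * x * y + math.sin(z)


def grad_F(x, y, z):
    """Gradient of F computed by hand: (2xy, x^2, cos z)."""
    return (2 * x * y, x * x, math.cos(z))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def numeric_directional(func, p, v, t=1e-6):
    """Symmetric difference quotient of func along the line p + t v."""
    forward = func(*(pi + t * vi for pi, vi in zip(p, v)))
    backward = func(*(pi - t * vi for pi, vi in zip(p, v)))
    return (forward - backward) / (2 * t)


p = (1.0, 2.0, 0.5)
v = (0.3, -0.4, 1.2)
```

Here `numeric_directional(F, p, v)` and `dot(grad_F(*p), v)` agree to within the accuracy of the difference quotient.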

8 Taylor's theorem 

Suppose that $f(x)$ is a function defined on $[a, b]$ with derivatives of any order. For a given natural
number $n$ we search for a polynomial in $(x - a)$ of degree $n$

$$p_n(x) = a_0 + a_1(x - a) + \cdots + a_n(x - a)^n$$

so that $f(x)$ agrees with $p_n(x)$ up to $n$th order derivatives at $a$, that is $f^{(k)}(a) = p_n^{(k)}(a)$ for
$k = 0, 1, \cdots, n$. Since $p_n^{(k)}(a) = k!\,a_k$ for $k = 0, \cdots, n$ we obtain $a_k = \frac{1}{k!}f^{(k)}(a)$ and therefore

$$p_n(x) = f(a) + f'(a)(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n \tag{8.1}$$

which is called Taylor's expansion (of order $n$) for $f$ at the point $a$. We have the following theorem
which will be proved in Prelims Analysis II in Hilary term.

Theorem 8.1 (Taylor's theorem for one variable function) Suppose $f(x)$ has derivatives at $a$ up
to $n$th order, then

$$f(x) = f(a) + f'(a)(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + o((x - a)^n) \tag{8.2}$$

as $x \to a$ [the right-hand side is called Taylor's expansion of $f$ at $a$ with Peano's remainder]. That
is

$$\lim_{x\to a}\frac{f(x) - p_n(x)}{(x - a)^n} = 0.$$

Taylor's theorem says the Taylor expansion of $n$th order is a good approximation of $f$ near $a$
up to $(x - a)^n$.

We can have a better estimate for the difference $f(x) - p_n(x)$ if $f$ has derivatives on $[a, b]$ up to
$(n + 1)$th order. Namely we have

Theorem 8.2 (Taylor's Theorem) Suppose $f(x)$ has derivatives on $[a, b]$ up to $(n + 1)$th order,
then for any $x \in (a, b]$ there is $\xi \in (a, x)$ such that

$$f(x) = f(a) + f'(a)(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \frac{f^{(n+1)}(\xi)}{(n + 1)!}(x - a)^{n+1}. \tag{8.3}$$

In particular, if $f$ has derivatives of any order, and if

$$\frac{M_n}{n!}(b - a)^n \to 0 \quad\text{as } n \to \infty$$

where $M_n = \sup_{[a,b]}|f^{(n)}(x)|$, then

$$f(x) = f(a) + f'(a)(x - a) + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \cdots \quad \forall x \in [a, b].$$

For example, we can easily see that

$$\cos x = 1 - \frac{x^2}{2!} + \cdots + (-1)^n\frac{x^{2n}}{(2n)!} + \cdots \quad \forall x \in (-\infty, \infty).$$
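
The remainder estimate can be illustrated with the cosine series. The sketch below sums the Taylor polynomial of $\cos x$ at $a = 0$ and compares the actual error with the bound $|x|^{2n+2}/(2n+2)!$, which follows from the Lagrange remainder because every derivative of $\cos$ is bounded by 1 (the polynomial of degree $2n$ is also the Taylor polynomial of order $2n + 1$, since the odd coefficients vanish). The function names are illustrative.

```python
import math


def cos_taylor(x, n):
    """Taylor polynomial of cos at 0: sum_{k=0}^{n} (-1)^k x^(2k) / (2k)!."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k)
               for k in range(n + 1))


def remainder_bound(x, n):
    """Lagrange bound |x|^(2n+2) / (2n+2)!, using |cos^(m)| <= 1 everywhere."""
    return abs(x) ** (2 * n + 2) / math.factorial(2 * n + 2)


# Actual error and bound at x = 2 for increasing n.
errors = [(abs(math.cos(2.0) - cos_taylor(2.0, n)), remainder_bound(2.0, n))
          for n in range(9)]
```

Each actual error stays below the corresponding bound, and both tend to zero rapidly as $n$ grows.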

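Let us now consider a function $f(x, y)$ of two variables defined on an open subset $U$. Suppose
$(x_0, y_0) \in U$. We search for a Taylor type expansion of $f(x, y)$ near $(x_0, y_0)$. Let us assume that
$f$ has continuous partial derivatives up to order $n$. Let $(x, y) \in U$ be close to $(x_0, y_0)$ so that the line
segment

$$\gamma(t) = (1 - t)(x_0, y_0) + t(x, y) = (x_0, y_0) + t(x - x_0, y - y_0)$$

(where $t \in [0, 1]$) between $(x_0, y_0)$ and $(x, y)$ lies in $U$. Consider the one variable function

$$g(t) = f\circ\gamma(t), \quad t \in [0, 1].$$

Then $g$ has derivatives on $[0, 1]$ up to order $n$, so we can apply Taylor's Theorem at $a = 0$. To
simplify our computations below we introduce the vector notation $\mathbf{v} = (x - x_0, y - y_0)$. We want to
calculate $g^{(k)}(0)$ for $k = 0, 1, \cdots$. Clearly $g(0) = f(x_0, y_0)$ and

$$g'(0) = \nabla f(x_0, y_0)\cdot\mathbf{v}$$

as we have seen in the previous sections. In general

$$g'(t) = \nabla f(\gamma(t))\cdot\gamma'(t) = \nabla f(\gamma(t))\cdot\mathbf{v} = f_x(\gamma(t))(x - x_0) + f_y(\gamma(t))(y - y_0)$$

so that, by differentiating in $t$ again, we obtain

$$\begin{aligned} g''(t) &= (x - x_0)\nabla f_x(\gamma(t))\cdot\mathbf{v} + (y - y_0)\nabla f_y(\gamma(t))\cdot\mathbf{v} \\ &= (x - x_0)\left(f_{xx}(\gamma(t))(x - x_0) + f_{xy}(\gamma(t))(y - y_0)\right) \\ &\quad + (y - y_0)\left(f_{yx}(\gamma(t))(x - x_0) + f_{yy}(\gamma(t))(y - y_0)\right) \\ &= f_{xx}(\gamma(t))(x - x_0)^2 + f_{xy}(\gamma(t))(x - x_0)(y - y_0) \\ &\quad + f_{xy}(\gamma(t))(y - y_0)(x - x_0) + f_{yy}(\gamma(t))(y - y_0)^2, \end{aligned}$$

where we have used $f_{yx} = f_{xy}$, and from which we can see the pattern for the $k$th derivative, namely

$$g^{(k)}(t) = \sum_{\substack{i+j=k \\ i,j\geq 0}}\frac{k!}{i!\,j!}\frac{\partial^k f(\gamma(t))}{\partial x^i\partial y^j}(x - x_0)^i(y - y_0)^j$$

and therefore

$$g^{(k)}(0) = \sum_{\substack{i+j=k \\ i,j\geq 0}}\frac{k!}{i!\,j!}\frac{\partial^k f(x_0, y_0)}{\partial x^i\partial y^j}(x - x_0)^i(y - y_0)^j.$$

According to Taylor's theorem for one variable functions applied to $g$ at $a = 0$, and since $g(1) = f(x, y)$:

$$f(x, y) = f(x_0, y_0) + \sum_{k=1}^{n}\sum_{\substack{i+j=k \\ i,j\geq 0}}\frac{1}{i!\,j!}\frac{\partial^k f(x_0, y_0)}{\partial x^i\partial y^j}(x - x_0)^i(y - y_0)^j + o(|(x - x_0, y - y_0)|^n) \tag{8.5}$$

as $(x, y) \to (x_0, y_0)$. If $f$ has partial derivatives on $U$ up to $(n + 1)$th order, and the segment
between $(x_0, y_0)$ and $(x, y)$ lies in $U$, then there is $\theta \in (0, 1)$ (depending on $n$, $(x_0, y_0)$, $(x, y)$ and
the function $f$) such that

$$\begin{aligned} f(x, y) = f(x_0, y_0) &+ \sum_{k=1}^{n}\sum_{\substack{i+j=k \\ i,j\geq 0}}\frac{1}{i!\,j!}\frac{\partial^k f(x_0, y_0)}{\partial x^i\partial y^j}(x - x_0)^i(y - y_0)^j \\ &+ \sum_{\substack{i+j=n+1 \\ i,j\geq 0}}\frac{1}{i!\,j!}\frac{\partial^{n+1} f(\xi)}{\partial x^i\partial y^j}(x - x_0)^i(y - y_0)^j \end{aligned} \tag{8.6}$$

where

$$\xi = \theta(x_0, y_0) + (1 - \theta)(x, y).$$

The right-hand side of (8.6) is called the Taylor expansion of the two variable function $f(x, y)$ at $(x_0, y_0)$.
To memorize this formula, you should compare it with the binomial expansion

$$(a + b)^k = \sum_{\substack{i+j=k \\ i,j\geq 0}}\frac{k!}{i!\,j!}a^i b^j$$

which corresponds to the $k$th derivative term in the Taylor expansion. But notice that the
combination numbers in the binomial expansion are $\frac{k!}{i!\,j!}$, whereas in Taylor's expansion they turn out to be
$\frac{1}{i!\,j!}$.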

It is particularly interesting for $n = 1$. For simplicity, suppose $U = B_R(x_0, y_0)$ is an open
disk centered at $(x_0, y_0)$ with radius $R > 0$. Suppose all first and second partial derivatives are
continuous on $U$. Then for any $(x, y) \in U$ there is $\xi \in U$ such that

$$\begin{aligned} f(x, y) = f(x_0, y_0) &+ \nabla f(x_0, y_0)\cdot(x - x_0, y - y_0) \\ &+ \frac{1}{2}\left[f_{xx}(\xi)(x - x_0)^2 + 2f_{xy}(\xi)(x - x_0)(y - y_0) + f_{yy}(\xi)(y - y_0)^2\right]. \end{aligned} \tag{8.7}$$

The remainder in (8.7) appears as a quadratic form in $x - x_0$ and $y - y_0$ with coefficients the second
partial derivatives, which can be written in terms of matrix multiplication as

$$\frac{1}{2}(x - x_0, y - y_0)\begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix}\begin{pmatrix} x - x_0 \\ y - y_0 \end{pmatrix}.$$

The square matrix

$$\begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix}$$

is called the hessian matrix of $f$, denoted by $D^2 f$.
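
A small numerical experiment shows how the gradient and the hessian matrix control the local behaviour of $f$. This is a sketch under stated assumptions: the test function $f(x, y) = e^x\cos y$ is a hypothetical choice, and the second derivatives are evaluated at $(x_0, y_0) = (0, 0)$ rather than at the intermediate point $\xi$, so the quadratic polynomial is an approximation whose error is of third order.

```python
import math


def quadratic_taylor(f0, grad, hess, dx, dy):
    """Second order Taylor polynomial built from f, its gradient and hessian at the base point."""
    (fx, fy), ((fxx, fxy), (_, fyy)) = grad, hess
    return (f0 + fx * dx + fy * dy
            + 0.5 * (fxx * dx * dx + 2 * fxy * dx * dy + fyy * dy * dy))


def f(x, y):
    """Hypothetical test function e^x cos y."""
    return math.exp(x) * math.cos(y)


# Hand-computed data of f at (0, 0): f = 1, gradient (1, 0), hessian [[1, 0], [0, -1]].
F0, GRAD, HESS = 1.0, (1.0, 0.0), ((1.0, 0.0), (0.0, -1.0))


def taylor_error(h):
    """Error of the quadratic approximation at the point (h, h)."""
    return abs(f(h, h) - quadratic_taylor(F0, GRAD, HESS, h, h))
```

Halving `h` reduces `taylor_error(h)` by a factor of about 8, consistent with a remainder of order $|(x - x_0, y - y_0)|^3$.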

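Similarly we may write down the Taylor expansion for a function of several variables. Suppose
$f(x_1, \cdots, x_k)$ is defined on a ball $B_r(\mathbf{a})$ centered at $\mathbf{a} = (a_1, \cdots, a_k)$ with radius $r > 0$, with
continuous partial derivatives of any order. Then, for any $\mathbf{x} = (x_1, \cdots, x_k)$ and $n = 1, 2, \ldots$, we
have

$$\begin{aligned} f(\mathbf{x}) = f(\mathbf{a}) &+ \nabla f(\mathbf{a})\cdot(\mathbf{x} - \mathbf{a}) + \cdots \\ &+ \frac{1}{n!}\sum_{\substack{i_1+\cdots+i_k=n \\ 0\leq i_1,\cdots,i_k\leq n}}\binom{n}{i_1\cdots i_k}\frac{\partial^n f(\mathbf{a})}{\partial x_1^{i_1}\cdots\partial x_k^{i_k}}(x_1 - a_1)^{i_1}\cdots(x_k - a_k)^{i_k} \\ &+ O(|\mathbf{x} - \mathbf{a}|^{n+1}) \end{aligned}$$

as $\mathbf{x} \to \mathbf{a}$, where

$$\binom{n}{i_1\cdots i_k} = \frac{n!}{i_1!\cdots i_k!}.$$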

9 Critical points 

In this part we apply Taylor's theorem to the study of multi-variable functions near critical points. 
For simplicity, we concentrate on two variable functions, though the techniques we are going to 
develop apply to several variable functions with necessary modifications. 

First of all we introduce the notions of local extrema. Let $f(x, y)$ be a function defined on a
subset $A \subset \mathbb{R}^2$. Then a point $(x_0, y_0) \in A$ is a local maximum (resp. local minimum) of $f$ if there
is an open ball $B_r(x_0, y_0) \subset A$ for some $r > 0$ such that

$$f(x, y) \leq f(x_0, y_0) \quad \forall (x, y) \in B_r(x_0, y_0) \tag{9.1}$$

(resp.

$$f(x, y) \geq f(x_0, y_0) \quad \forall (x, y) \in B_r(x_0, y_0)). \tag{9.2}$$

On the other hand, we say $(x_0, y_0) \in A$ is a (global) maximum (resp. (global) minimum) if
$f(x, y) \leq f(x_0, y_0)$ (resp. $f(x, y) \geq f(x_0, y_0)$) for every $(x, y) \in A$. We should note that a global
maximum (or a global minimum) for a function is not necessarily a local one. For example, consider
the function $f(x, y) = x^2 + y^2$ defined on $A = \{(x, y) : x^2 + y^2 \leq 1\}$, the closed unit disk. Then
every point on the unit circle is a global maximum, but not a local one, since no open ball around such a point is contained in $A$.

Theorem 9.1 (Fermat) Suppose that $f(x, y)$ defined on an open subset $U$ has continuous partial
derivatives, and $(x_0, y_0)$ is a local maximum (or a local minimum), then

$$\frac{\partial f(x_0, y_0)}{\partial x} = \frac{\partial f(x_0, y_0)}{\partial y} = 0. \tag{9.3}$$

That is, the gradient vector $\nabla f(x_0, y_0) = 0$.

Proof. Consider the local maximum case. There is $\varepsilon > 0$ such that $B_\varepsilon(x_0, y_0) \subset U$ and (9.1)
holds on $B_\varepsilon(x_0, y_0)$. Let $\mathbf{v} = (v_1, v_2)$ be any unit vector and let $\gamma(t) = (x_0, y_0) + t\mathbf{v}$. Consider the one variable function
$g(t) = f\circ\gamma(t)$. Then $g(t) \leq g(0)$ for any $t \in (-\varepsilon, \varepsilon)$ and $g'(0)$ exists by the chain rule. On the
one hand

$$g'(0) = \lim_{\substack{t\to 0 \\ t>0}}\frac{g(t) - g(0)}{t} \leq 0$$

and on the other hand

$$g'(0) = \lim_{\substack{t\to 0 \\ t<0}}\frac{g(t) - g(0)}{t} \geq 0$$

so we must have $g'(0) = 0$. But $g'(0)$ is just the directional derivative of $f$ in $\mathbf{v} = (v_1, v_2)$, so that

$$D_{\mathbf{v}}f(x_0, y_0) = v_1\frac{\partial f(x_0, y_0)}{\partial x} + v_2\frac{\partial f(x_0, y_0)}{\partial y} = 0$$

for any unit vector $(v_1, v_2)$, which yields (9.3). ■

Any point $(x_0, y_0)$ such that $\nabla f(x_0, y_0) = 0$ is called a critical (or stationary) point. Fermat's
theorem says local extrema must be stationary points. Therefore we search for local extrema among
the stationary points. Taylor's expansion allows us to say more about whether a stationary point is a
local extreme point or not.

To this end, we have to look at the remainder term which appears in Taylor's expansion, i.e.
the term

$$f_{xx}(\xi)(x - x_0)^2 + 2f_{xy}(\xi)(x - x_0)(y - y_0) + f_{yy}(\xi)(y - y_0)^2.$$

By considering the quadratic function $a\lambda^2 + 2c\lambda + b$, whose discriminant is $4(c^2 - ab)$, we have
the following

Lemma 9.2 1) If $c^2 - ab < 0$ and $a > 0$ (so $b > 0$ as well) then

$$a\lambda^2 + 2c\lambda\mu + b\mu^2 \geq 0$$

and equality holds if and only if $\lambda = \mu = 0$.

2) If $c^2 - ab < 0$ and $a < 0$ (so $b < 0$ as well) then

$$a\lambda^2 + 2c\lambda\mu + b\mu^2 \leq 0$$

and equality holds if and only if $\lambda = \mu = 0$.

Together with Taylor's expansion we are now in a position to derive further information about 
stationary points. 

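Theorem 9.3 Suppose that $f(x, y)$ defined on an open subset $U$ has continuous derivatives up to
second order, and suppose $(x_0, y_0) \in U$ is a critical point: $\nabla f(x_0, y_0) = 0$.

1) If

$$\left(\frac{\partial^2 f(x_0, y_0)}{\partial x\partial y}\right)^2 - \frac{\partial^2 f(x_0, y_0)}{\partial x^2}\frac{\partial^2 f(x_0, y_0)}{\partial y^2} < 0, \qquad \frac{\partial^2 f(x_0, y_0)}{\partial x^2} > 0 \tag{9.4}$$

then $(x_0, y_0)$ is a local minimum.

2) If

$$\left(\frac{\partial^2 f(x_0, y_0)}{\partial x\partial y}\right)^2 - \frac{\partial^2 f(x_0, y_0)}{\partial x^2}\frac{\partial^2 f(x_0, y_0)}{\partial y^2} < 0, \qquad \frac{\partial^2 f(x_0, y_0)}{\partial x^2} < 0 \tag{9.5}$$

then $(x_0, y_0)$ is a local maximum.

Proof. Since all partial derivatives up to second order are continuous, we can choose a small
$\varepsilon > 0$ so that the open disk $B_\varepsilon(x_0, y_0) \subset U$ and (9.4) (resp. (9.5)) holds not only at $\mathbf{a} = (x_0, y_0)$ but
also at any point in $B_\varepsilon(\mathbf{a})$. For any $\mathbf{x} = (x, y) \in B_\varepsilon(\mathbf{a})$, according to Taylor's theorem, there is $\xi \in B_\varepsilon(\mathbf{a})$
(though depending on $\mathbf{x}$) such that

$$f(\mathbf{x}) = f(\mathbf{a}) + \nabla f(\mathbf{a})\cdot(\mathbf{x} - \mathbf{a}) + \frac{1}{2}\left[f_{xx}(\xi)(x - x_0)^2 + 2f_{xy}(\xi)(x - x_0)(y - y_0) + f_{yy}(\xi)(y - y_0)^2\right].$$

Suppose (9.4) holds, so it holds on $B_\varepsilon(\mathbf{a})$ for small $\varepsilon > 0$. Then, by Lemma 9.2,

$$f_{xx}(\xi)(x - x_0)^2 + 2f_{xy}(\xi)(x - x_0)(y - y_0) + f_{yy}(\xi)(y - y_0)^2 \geq 0$$

which, together with $\nabla f(\mathbf{a}) = 0$, yields $f(\mathbf{x}) \geq f(\mathbf{a})$ on $B_\varepsilon(\mathbf{a})$, so $\mathbf{a}$ is a local minimum. The case (9.5) is similar. ■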
A natural question is, of course, what can we say if

$$\left(\frac{\partial^2 f(x_0, y_0)}{\partial x\partial y}\right)^2 - \frac{\partial^2 f(x_0, y_0)}{\partial x^2}\frac{\partial^2 f(x_0, y_0)}{\partial y^2} \geq 0?$$

If

$$\left(\frac{\partial^2 f(x_0, y_0)}{\partial x\partial y}\right)^2 - \frac{\partial^2 f(x_0, y_0)}{\partial x^2}\frac{\partial^2 f(x_0, y_0)}{\partial y^2} = 0$$

then, based only on the information about the first and second partial derivatives at $(x_0, y_0)$, we
cannot know the sign of

$$f_{xx}(\xi)(x - x_0)^2 + 2f_{xy}(\xi)(x - x_0)(y - y_0) + f_{yy}(\xi)(y - y_0)^2$$

appearing in the Taylor expansion, so in this case we are unable to tell whether $(x_0, y_0)$ is a local
extreme point or not.

On the other hand, if

$$\left(\frac{\partial^2 f(x_0, y_0)}{\partial x\partial y}\right)^2 - \frac{\partial^2 f(x_0, y_0)}{\partial x^2}\frac{\partial^2 f(x_0, y_0)}{\partial y^2} > 0$$

then, by continuity, the same inequality continues to hold on a small disk near $(x_0, y_0)$, and thus

$$f_{xx}(\xi)(x - x_0)^2 + 2f_{xy}(\xi)(x - x_0)(y - y_0) + f_{yy}(\xi)(y - y_0)^2$$

is indefinite, i.e. it can take both positive and negative values, so in this case the stationary point
$(x_0, y_0)$ is not a local extreme point. Such a critical point is called a saddle point.

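Example 9.4 Consider $f(x, y) = \sin x + \sin y - \sin(x + y)$. Find the maximum and minimum
values of $f$ on the triangle enclosed by the $x$-axis, the $y$-axis and the line $x + y = 2\pi$.

The triangle is bounded and closed, and $f$ is continuous, so $f$ achieves its maximum and
minimum values. A global extremum either lies on the boundary of the triangle, i.e. $x = 0$,
$0 \leq y \leq 2\pi$; $y = 0$, $0 \leq x \leq 2\pi$; or $x + y = 2\pi$, $0 \leq x, y \leq 2\pi$, or lies in the interior of the triangle. In
the latter case, a global extreme point must be a local one, hence must be a critical point of $f$. Hence we
first locate the possible critical points inside the triangle by solving the following system

$$\frac{\partial f}{\partial x} = \cos x - \cos(x + y) = 0, \qquad \frac{\partial f}{\partial y} = \cos y - \cos(x + y) = 0,$$

to obtain only one critical point $\left(\frac{2\pi}{3}, \frac{2\pi}{3}\right)$, and $f\left(\frac{2\pi}{3}, \frac{2\pi}{3}\right) = \frac{3\sqrt{3}}{2}$. On the other hand, on the
boundary $f(x, y) = 0$, so $\left(\frac{2\pi}{3}, \frac{2\pi}{3}\right)$ is the global maximum, and the minimum value $0$ is attained on the boundary.

Since

$$\frac{\partial^2 f}{\partial x^2} = -\sin x + \sin(x + y), \qquad \frac{\partial^2 f}{\partial y^2} = -\sin y + \sin(x + y), \qquad \frac{\partial^2 f}{\partial x\partial y} = \sin(x + y),$$

at $\left(\frac{2\pi}{3}, \frac{2\pi}{3}\right)$ the discriminant is

$$D = \sin^2\left(\frac{4\pi}{3}\right) - \left(-\sin\frac{2\pi}{3} + \sin\frac{4\pi}{3}\right)^2 = \frac{3}{4} - 3 < 0$$

and

$$\frac{\partial^2 f}{\partial x^2} = -\sin\frac{2\pi}{3} + \sin\frac{4\pi}{3} = -\sqrt{3} < 0$$

so that $\left(\frac{2\pi}{3}, \frac{2\pi}{3}\right)$ is a local maximum.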
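
The second derivative test of Theorem 9.3 is mechanical enough to express as a small routine. The sketch below is illustrative (the function name `classify` and its string labels are hypothetical); it applies the test to the critical point $(2\pi/3, 2\pi/3)$ of $f(x, y) = \sin x + \sin y - \sin(x + y)$.

```python
import math


def classify(fxx, fxy, fyy):
    """Second derivative test: sign of fxy^2 - fxx*fyy, then sign of fxx."""
    d = fxy * fxy - fxx * fyy
    if d < 0:
        return "local minimum" if fxx > 0 else "local maximum"
    if d > 0:
        return "saddle point"
    return "inconclusive"


def f(x, y):
    return math.sin(x) + math.sin(y) - math.sin(x + y)


x0 = y0 = 2 * math.pi / 3
fxx = -math.sin(x0) + math.sin(x0 + y0)   # equals -sqrt(3)
fyy = -math.sin(y0) + math.sin(x0 + y0)   # equals -sqrt(3)
fxy = math.sin(x0 + y0)                   # equals -sqrt(3)/2
kind = classify(fxx, fxy, fyy)
```

Here `kind` is `"local maximum"` and `f(x0, y0)` equals $3\sqrt{3}/2$ up to rounding, in agreement with the calculation above.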

There is a generalization to functions of several variables. To this end we have to borrow a notion
about symmetric matrices from linear algebra. We say an $n \times n$ symmetric matrix $A = (a_{ij})$
(where $a_{ij} = a_{ji}$ for any pair $(i, j)$) is positive definite (resp. negative definite) if

$$A\mathbf{v}\cdot\mathbf{v} = \sum_{i,j=1}^{n}a_{ij}v_i v_j \geq 0 \ (\text{resp.} \leq 0) \quad \forall\,\mathbf{v} = (v_1, \cdots, v_n) \in \mathbb{R}^n, \tag{9.8}$$

and equality holds if and only if $\mathbf{v} = 0$.

For a function $f(x_1, \cdots, x_n)$ of $n$ variables with continuous partial derivatives up to second
order, the hessian matrix $D^2 f$ is the $n \times n$ matrix with entry $\frac{\partial^2 f}{\partial x_i\partial x_j}$ at the $i$th row and $j$th
column, i.e.

$$D^2 f = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \cdots & \frac{\partial^2 f}{\partial x_1\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n\partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{pmatrix} \tag{9.9}$$

which is a symmetric matrix-valued function.

Theorem 9.5 Suppose $f(\mathbf{x})$ is a function of $n$ variables $\mathbf{x} = (x_1, \cdots, x_n)$ defined on an open
subset $U \subset \mathbb{R}^n$ which has continuous partial derivatives up to second order. Let $\mathbf{a} = (a_1, \cdots, a_n)$
be a critical point: $\nabla f(\mathbf{a}) = 0$.

1) If the hessian matrix $D^2 f(\mathbf{a})$ is positive definite, then $\mathbf{a}$ is a local minimum of $f$.

2) If the hessian matrix $D^2 f(\mathbf{a})$ is negative definite, then $\mathbf{a}$ is a local maximum of $f$.

The proof follows from a discussion via Taylor's expansion at the critical point $\mathbf{a}$.

10 Lagrange's multipliers 

In this part we develop a method of locating relative local extrema. Let us first consider the question
with three variables, and consider the following problem. Let $f(x, y, z)$ be a function defined on a
subset $U \subset \mathbb{R}^3$. We wish to locate the local extrema of $f(x, y, z)$ subject to the following constraint

$$F(x, y, z) = 0. \tag{10.1}$$

We say $(x_0, y_0, z_0) \in U$ is a (relative) local minimum subject to (10.1) if $F(x_0, y_0, z_0) = 0$ and there
is a small ball $B_\varepsilon$ centered at $(x_0, y_0, z_0)$ with radius $\varepsilon > 0$ such that $f(x, y, z) \geq f(x_0, y_0, z_0)$ for
every $(x, y, z) \in B_\varepsilon$ which satisfies (10.1).

Theorem 10.1 Let $f(x, y, z)$ and $F(x, y, z)$ be two functions on an open subset $U \subset \mathbb{R}^3$. Suppose
that both functions $f$ and $F$ have continuous partial derivatives, and the gradient vector field $\nabla F \neq 0$
on $U$. Let $(x_0, y_0, z_0) \in U$ be a local maximum or local minimum of $f(x, y, z)$ subject to the constraint
(10.1). Then there is a real number $\lambda$ such that $\nabla f(x_0, y_0, z_0) = \lambda\nabla F(x_0, y_0, z_0)$.

Proof. Since $\nabla F \neq 0$ on $U$, the equation (10.1) defines a surface

$$S = \{(x, y, z) \in U : F(x, y, z) = 0\}.$$

By assumption, $(x_0, y_0, z_0) \in S$ is a local maximum or minimum of the restriction of the function $f$
to $S$. Take any differentiable curve $\gamma(t) = (x(t), y(t), z(t))$ lying on the surface $S$ and passing through
$(x_0, y_0, z_0)$, i.e.

$$F\circ\gamma(t) = 0 \quad \forall t \in (-\varepsilon, \varepsilon), \qquad \gamma(0) = (x_0, y_0, z_0),$$

and consider $h(t) = f\circ\gamma(t)$. Then by the definition of relative local extrema, $t = 0$ is a local maximum
or minimum of the function $h(t)$. Therefore, by Fermat's theorem, $h'(0) = 0$. On the other hand,
according to the chain rule

$$h'(0) = \nabla f(\gamma(0))\cdot\gamma'(0) = 0,$$

which means that $\nabla f(x_0, y_0, z_0)$ is perpendicular to $\gamma'(0)$. Since $\gamma(t)$ is any curve lying on the
surface $S$ passing through $(x_0, y_0, z_0)$, $\gamma'(0)$ can be any tangent vector to $S$ at $(x_0, y_0, z_0)$.
Therefore $\nabla f(x_0, y_0, z_0)$ must be perpendicular to the tangent plane of $S$ at $(x_0, y_0, z_0)$. It follows
that $\nabla f(x_0, y_0, z_0)$ either equals $0$, or $\nabla f(x_0, y_0, z_0) \neq 0$ is normal to $S$ at $(x_0, y_0, z_0)$. On the other
hand a normal vector to $S$ at $(x_0, y_0, z_0)$ is $\nabla F(x_0, y_0, z_0)$, therefore $\nabla f(x_0, y_0, z_0)$ and $\nabla F(x_0, y_0, z_0)$
are parallel. Since $\nabla F(x_0, y_0, z_0) \neq 0$, there is $\lambda$ such that $\nabla f(x_0, y_0, z_0) = \lambda\nabla F(x_0, y_0, z_0)$. ■

As a by-product, we have proved that if $(x_0, y_0, z_0) \in S$ is a relative local maximum or minimum
of $f(x, y, z)$ along $S$ (i.e. satisfying the constraint (10.1)), then $\nabla f(x_0, y_0, z_0)$ is perpendicular to
the level surface $S : F(x, y, z) = 0$.

According to the previous theorem, in order to find the constrained extrema of $f$ we should
look among those $(x, y, z) \in U$ and real numbers $\lambda$ which satisfy the following system

$$\begin{cases} \nabla f(x, y, z) = \lambda\nabla F(x, y, z), \\ F(x, y, z) = 0. \end{cases} \tag{10.2}$$

[We often assume that $\nabla F(x, y, z) \neq 0$.] Of course we are interested in those $(x, y, z) \in U$ such
that there is a real number $\lambda$ which solves the system (10.2). In practice, we need to solve for $(x, y, z)$,
but there is no need to know the explicit value of $\lambda$. The constant $\lambda$ introduced here to help us
locate the relative extrema is called a Lagrange multiplier.

Introduce a function $G(x, y, z, \lambda) = f(x, y, z) - \lambda F(x, y, z)$. Then the system (10.2) may be
written as

$$\frac{\partial G}{\partial x} = \frac{\partial G}{\partial y} = \frac{\partial G}{\partial z} = \frac{\partial G}{\partial\lambda} = 0$$

which means a solution $(x, y, z, \lambda)$ to (10.2) is just a critical point of $G(x, y, z, \lambda)$.
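Example 10.2 Maximize $f(x, y, z) = x + y$ subject to the constraint $x^2 + y^2 + z^2 = 1$.

To use the method of Lagrange multipliers, set

$$G(x, y, z, \lambda) = x + y - \lambda(x^2 + y^2 + z^2 - 1)$$

and look for the critical points of $G$ by solving the system

$$\begin{aligned} \frac{\partial G}{\partial x} &= 1 - 2\lambda x = 0, \\ \frac{\partial G}{\partial y} &= 1 - 2\lambda y = 0, \\ \frac{\partial G}{\partial z} &= -2\lambda z = 0, \\ -\frac{\partial G}{\partial\lambda} &= x^2 + y^2 + z^2 - 1 = 0. \end{aligned}$$

The first equation implies that $\lambda \neq 0$, so from the first three equations we obtain $z = 0$ and
$x = y = \frac{1}{2\lambda}$. Substituting them into the constraint we obtain

$$\left(\frac{1}{2\lambda}\right)^2 + \left(\frac{1}{2\lambda}\right)^2 + 0^2 = 1$$

so that $\lambda = \pm\frac{1}{\sqrt{2}}$. Thus there are two possible relative extrema $\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0\right)$ and $\left(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 0\right)$.

Since the sphere $S : x^2 + y^2 + z^2 = 1$ is compact (bounded and closed), and the function $f(x, y, z) =
x + y$ is continuous, it must achieve its maximum and minimum values [we will prove this kind
of statement in Prelims Analysis II]. Therefore the maximum of $f$ subject to the constraint is

$$f\left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0\right) = \sqrt{2}$$

while

$$f\left(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 0\right) = -\sqrt{2}$$

is the constrained minimum value of $f$ over the unit sphere.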
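
A brute-force check can complement the Lagrange computation. The sketch below assumes the constraint is the unit sphere $x^2 + y^2 + z^2 = 1$ and the objective is $f = x + y$; it verifies $\nabla f = \lambda\nabla F$ at the candidate point and compares the candidate value with a grid search over spherical angles. The grid size and helper names are illustrative choices.

```python
import math


def f(x, y, z):
    return x + y


# Candidate from the Lagrange system: lambda = 1/sqrt(2), x = y = 1/(2 lambda), z = 0.
lam = 1 / math.sqrt(2)
candidate = (1 / (2 * lam), 1 / (2 * lam), 0.0)
grad_f = (1.0, 1.0, 0.0)
grad_F = tuple(2 * c for c in candidate)   # gradient of x^2 + y^2 + z^2 - 1


def sampled_max(n=200):
    """Largest value of f over a grid of points on the unit sphere."""
    best = -math.inf
    for i in range(n + 1):
        phi = math.pi * i / n
        for j in range(n):
            theta = 2 * math.pi * j / n
            best = max(best, f(math.sin(phi) * math.cos(theta),
                               math.sin(phi) * math.sin(theta),
                               math.cos(phi)))
    return best
```

The grid maximum agrees with $f$ at the candidate point, namely $\sqrt{2}$, and `grad_f` equals `lam` times `grad_F` componentwise.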

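To conclude our discussion, let us describe the general form of the Lagrange multipliers. Suppose
that $f(x_1, \cdots, x_n)$ and $F_1(x_1, \cdots, x_n), \cdots, F_k(x_1, \cdots, x_n)$ are functions of $n$ variables defined
on an open subset $U \subset \mathbb{R}^n$, where $n, k \in \mathbb{N}$. Suppose that $f, F_1, \cdots, F_k$ have continuous partial
derivatives. Then the local extrema of $f(x_1, \cdots, x_n)$ subject to the following constraints:

$$\begin{cases} F_1(x_1, \cdots, x_n) = 0, \\ \qquad\vdots \\ F_k(x_1, \cdots, x_n) = 0 \end{cases}$$

are solutions to the following system

$$\frac{\partial G}{\partial x_1} = \cdots = \frac{\partial G}{\partial x_n} = \frac{\partial G}{\partial\lambda_1} = \cdots = \frac{\partial G}{\partial\lambda_k} = 0 \tag{10.3}$$

where

$$G(x_1, \cdots, x_n, \lambda_1, \cdots, \lambda_k) = f(x_1, \cdots, x_n) - \lambda_1 F_1(x_1, \cdots, x_n) - \cdots - \lambda_k F_k(x_1, \cdots, x_n);$$

the additional constants $\lambda_1, \cdots, \lambda_k$ are called the Lagrange multipliers.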

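Example 10.3 Find the extreme points of $f(x, y, z) = x + y + z$ subject to the conditions $x^2 + y^2 = 2$
and $y^2 + z^2 = 2$.

Construct the function

$$G(x, y, z, \lambda_1, \lambda_2) = x + y + z - \lambda_1(x^2 + y^2 - 2) - \lambda_2(y^2 + z^2 - 2).$$

We want to solve, in order to locate extreme points, the following system

$$\begin{aligned} \frac{\partial G}{\partial x} &= 1 - 2\lambda_1 x = 0, \\ \frac{\partial G}{\partial y} &= 1 - 2(\lambda_1 + \lambda_2)y = 0, \\ \frac{\partial G}{\partial z} &= 1 - 2\lambda_2 z = 0, \end{aligned}$$

together with the constraints $x^2 + y^2 = 2$ and $y^2 + z^2 = 2$. Clearly $\lambda_1, \lambda_2, \lambda_1 + \lambda_2 \neq 0$, and $x = \frac{1}{2\lambda_1}$, $z = \frac{1}{2\lambda_2}$
and $y = \frac{1}{2(\lambda_1 + \lambda_2)}$. From the constraints we deduce that $x = \pm z$, which implies that $\lambda_1 = \lambda_2$ as
$\lambda_1 + \lambda_2 \neq 0$. Hence $x = z = \frac{1}{2\lambda_1}$ and $y = \frac{1}{4\lambda_1}$. Using the constraints again we obtain

$$\left(\frac{1}{2\lambda_1}\right)^2 + \left(\frac{1}{4\lambda_1}\right)^2 = 2$$

which leads to the solutions $\frac{1}{\lambda_1} = \pm 4\sqrt{\frac{2}{5}}$. Thus the possible constrained extreme points are

$$\left(2\sqrt{\tfrac{2}{5}}, \sqrt{\tfrac{2}{5}}, 2\sqrt{\tfrac{2}{5}}\right) \quad\text{and}\quad \left(-2\sqrt{\tfrac{2}{5}}, -\sqrt{\tfrac{2}{5}}, -2\sqrt{\tfrac{2}{5}}\right)$$

at which the function $f$ achieves the relative maximum and minimum values respectively.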
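
The candidate points can be substituted back into the full Lagrange system. The sketch below assumes the constraints are $x^2 + y^2 = 2$ and $y^2 + z^2 = 2$, as implied by the partial derivatives of $G$; the function name `residuals` is hypothetical.

```python
import math


def residuals(x, y, z, lam1, lam2):
    """Left-hand sides of the Lagrange system for Example 10.3; all should vanish."""
    return (
        1 - 2 * lam1 * x,            # dG/dx
        1 - 2 * (lam1 + lam2) * y,   # dG/dy
        1 - 2 * lam2 * z,            # dG/dz
        x * x + y * y - 2,           # first constraint
        y * y + z * z - 2,           # second constraint
    )


s = math.sqrt(2 / 5)
lam = 1 / (4 * s)                    # lambda_1 = lambda_2, from 1/lambda_1 = 4*sqrt(2/5)
point = (2 * s, s, 2 * s)
value = sum(point)                   # f = x + y + z at the candidate point
```

All five residuals vanish at `point` with `lam1 = lam2 = lam`, and `value` equals $5\sqrt{2/5} = \sqrt{10}$; by symmetry the opposite point gives $-\sqrt{10}$.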